CRBs as modifiers of branching morphogenesis and methods of use

ABSTRACT

Human CRB genes are identified as modulators of branching morphogenesis, and thus are therapeutic targets for disorders associated with defective branching morphogenesis function. Methods for identifying modulators of branching morphogenesis, comprising screening for agents that modulate the activity of CRB, are provided.

REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. provisional patent application No. 60/333,388 filed Nov. 26, 2001. The contents of the prior application are hereby incorporated in their entirety.

BACKGROUND OF THE INVENTION

[0002] Several essential organs (e.g., lungs, kidney, lymphatic system and vasculature) are made up of complex networks of tube-like structures that serve to transport and exchange fluids, gases, nutrients and waste. The formation of these complex branched networks occurs by the evolutionarily conserved process of branching morphogenesis, in which successive ramification occurs by sprouting, pruning and remodeling of the network. During human embryogenesis, blood vessels develop via two processes: vasculogenesis, whereby endothelial cells are born from progenitor cell types; and angiogenesis, in which new capillaries sprout from existing vessels.

[0003] Branching morphogenesis encompasses many cellular processes, including proliferation, survival/apoptosis, migration, invasion, adhesion, aggregation and matrix remodeling. Numerous cell types contribute to branching morphogenesis, including endothelial, epithelial and smooth muscle cells, and monocytes. Gene pathways that modulate the branching process function both within the branching tissues and in other cells; for example, certain monocytes can promote an angiogenic response even though they may not directly participate in the formation of the branch structures.

[0004] An increased level of angiogenesis is central to several human disease pathologies, including rheumatoid arthritis and diabetic retinopathy, and, significantly, to the growth, maintenance and metastasis of solid tumors (for detailed reviews see Liotta L A et al, 1991 Cell 64:327-336; Folkman J., 1995 Nature Medicine 1:27-31; Hanahan D and Folkman J, 1996 Cell 86:353-364). Impaired angiogenesis figures prominently in other human diseases, including heart disease, stroke, infertility, ulcers and scleroderma.

[0005] The transition from dormant to active blood vessel formation involves modulating the balance between angiogenic stimulators and inhibitors. Under certain pathological circumstances an imbalance arises between local inhibitory controls and angiogenic inducers, resulting in excessive angiogenesis, while under other pathological conditions an imbalance leads to insufficient angiogenesis. This delicate equilibrium of pro- and anti-angiogenic factors is regulated by a complex interaction between the extracellular matrix, endothelial cells, smooth muscle cells, and various other cell types, as well as environmental factors such as oxygen demand within tissues. The lack of oxygen (hypoxia) in and around wounds and solid tumors is thought to provide a key driving force for angiogenesis by regulating a number of angiogenic factors, including Hypoxia Inducible Factor 1 alpha (HIF1 alpha) (Richard D E et al., Biochem Biophys Res Commun. Dec. 29, 1999; 266(3):718-22). HIF1 in turn regulates expression of a number of growth factors including Vascular Endothelial Growth Factor (VEGF) (Connolly D T, J Cell Biochem November 1991;47(3):219-23). Various VEGF ligands and receptors are vital regulators of endothelial cell proliferation, survival, vessel permeability and sprouting, and lymphangiogenesis (Neufeld G et al., FASEB J January 1999;13(1):9-22; Stacker S A et al., Nature Medicine 2001 7:186-191; Skobe M, et al., Nature Medicine 2001 7:192-198; Makinen T, et al., Nature Medicine 2001 7:199-205).

[0006] Most known angiogenesis genes, their biochemical activities, and their organization into signaling pathways are employed in a similar fashion during angiogenesis in human, mouse and Zebrafish, as well as during branching morphogenesis of the Drosophila trachea. Accordingly, Drosophila tracheal development and zebrafish vascular development provide useful models for studying mammalian angiogenesis (Sutherland D et al., Cell 1996, 87:1091-101; Roush W, Science 1996, 274:2011; Skaer H., Curr Biol 1997, 7:R238-41; Metzger R J, Krasnow M A. Science. 1999. 284:1635-9; Roman B L, and Weinstein B M. Bioessays 2000, 22:882-93).

[0007] The Drosophila cell-polarity gene crumbs is thought to play a central role in establishing apical-basal polarity in epithelial cells of the fruitfly (Wodarz A, et al., (1995) Cell 82: 67-76). Recent work on crumbs (Klebes A, and Knust E (2000) Curr Biol 10: 76-85; den Hollander A I, et al., (1999) Nat Genet 23: 217-221) has shed new light on the question of how membrane domains are defined. Mutations in a human homologue of Drosophila crumbs cause retinitis pigmentosa (RP12) (den Hollander A I, et al., supra).

[0008] All members of the Crumbs protein family show a high degree of homology in the short intracellular (IC) region. In Drosophila this region is both necessary and sufficient for the establishment of apico-basal polarity (Klebes A and Knust E (2000) Current Biology 10:76-85). The IC domain contains essentially two conserved functional features, a FERM-binding domain that immediately follows the transmembrane region and a C-terminal PDZ-binding motif (ERLI). Both of these domains are essential for function (Klebes and Knust 2000, supra). The FERM-binding domain interacts with β-Spectrin and the ERM protein Moesin, while the ERLI motif binds to the PDZ domain of the proteins Discs lost and Stardust (Medina E et al. (2002) Journal of Cell Biology 2002, 158:941-951; Klebes and Knust, supra; Hong Y et al. (2001) Nature 414:634-638).

[0009] Overexpression of Crumbs results in a phenotype that is somewhat different from the loss-of-function phenotype. Similar to the loss-of-function phenotype, apico-basal polarity is lost in epidermal cells overexpressing either full-length Crumbs or a membrane-bound Crumbs IC. However, these cells arrange themselves in a disorganized multi-layered epithelium that is strikingly different from the normal columnar single-layer organization of the epidermis (Klebes and Knust, supra). This phenotype is reminiscent of the arrangement of APC colon tumor cell lines that in culture form a multi-layered epithelium in which cells show loss of apico-basal polarity and adopt a mesenchymal morphology. Remarkably, this phenotype can be reverted by inhibition of the wnt/β-catenin signaling pathway with a dominant negative form of TCF (T-Cell Factor) (Naishiro Y et al. (2001) Cancer Research 61:2751-2758).

[0010] The ability to manipulate and screen the genomes of model organisms such as Drosophila and zebrafish provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation of genes, pathways, and cellular processes, have direct relevance to more complex vertebrate organisms.

[0011] Short life cycles and powerful forward and reverse genetic tools available for both Zebrafish and Drosophila allow rapid identification of critical components of pathways controlling branching morphogenesis. Given the evolutionary conservation of gene sequences and molecular pathways, the human orthologs of model organism genes can be utilized to modulate branching morphogenesis pathways, including angiogenesis.

[0012] All references cited herein, including patents, patent applications, publications, and sequence information in referenced Genbank identifier numbers, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0013] We have discovered genes that modify branching morphogenesis in Drosophila, and identified their human orthologs, hereinafter referred to as CRUMBS (CRB). The invention provides methods for utilizing these branching morphogenesis modifier genes and polypeptides to identify CRB-modulating agents that are candidate therapeutic agents that can be used in the treatment of disorders associated with defective or impaired branching morphogenesis function and/or CRB function. Preferred CRB-modulating agents specifically bind to CRB polypeptides and restore branching morphogenesis function. Other preferred CRB-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress CRB gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0014] CRB modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with a CRB polypeptide or nucleic acid. In one embodiment, candidate CRB modulating agents are tested with an assay system comprising a CRB polypeptide or nucleic acid. Agents that produce a change in the activity of the assay system relative to controls are identified as candidate branching morphogenesis modulating agents. The assay system may be cell-based or cell-free. CRB-modulating agents include CRB related proteins (e.g. dominant negative mutants, and biotherapeutics); CRB-specific antibodies; CRB-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind to or interact with CRB or compete with CRB binding partner (e.g. by binding to a CRB binding partner). In one specific embodiment, a small molecule modulator is identified using a binding assay. In specific embodiments, the screening assay system is selected from an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.

[0015] In another embodiment of the invention, the assay system comprises cultured cells or a non-human animal expressing CRB, and the assay system detects an agent-biased change in branching morphogenesis, including angiogenesis. Events detected by cell-based assays include cell proliferation, cell cycling, apoptosis, tubulogenesis, cell migration, and response to hypoxic conditions. For assays that detect tubulogenesis or cell migration, the assay system may comprise the step of testing the cellular response to stimulation with at least two different pro-angiogenic agents. Alternatively, tubulogenesis or cell migration may be detected by stimulating cells with an inflammatory angiogenic agent. In specific embodiments, the animal-based assay is selected from a matrix implant assay, a xenograft assay, a hollow fiber assay, or a transgenic tumor assay.

[0016] In another embodiment, candidate branching morphogenesis modulating agents that have been identified in cell-free or cell-based assays are further tested using a second assay system that detects changes in an activity associated with branching morphogenesis. In a specific embodiment, the second assay detects an agent-biased change in an activity associated with angiogenesis. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating branching morphogenesis, including increased or impaired angiogenesis or solid tumor metastasis.

[0017] The invention further provides methods for modulating CRB function and/or branching morphogenesis in a mammalian cell by contacting the mammalian cell with an agent that specifically binds a CRB polypeptide or nucleic acid. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with branching morphogenesis.

DETAILED DESCRIPTION OF THE INVENTION

[0018] In a Drosophila screen designed to identify genes associated with tracheal defects, we discovered that crumbs (Genbank Identifier [GI] 552087) modulates branching morphogenesis. Accordingly, vertebrate orthologs of these modifiers, and preferably the human orthologs, CRB genes (i.e., nucleic acids and polypeptides) are attractive drug targets for the treatment of pathologies associated with a defective branching morphogenesis signaling pathway, such as cancer.

[0019] In vitro and in vivo methods of assessing CRB function are provided herein. Modulation of the CRB or their respective binding partners is useful for understanding the association of branching morphogenesis and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for branching morphogenesis related pathologies. CRB-modulating agents that act by inhibiting or enhancing CRB expression, directly or indirectly, for example, by affecting a CRB function such as binding activity, can be identified using methods provided herein. CRB modulating agents are useful in diagnosis, therapy and pharmaceutical development.

[0020] As used herein, branching morphogenesis encompasses the numerous cellular processes involved in the formation of branched networks, including proliferation, survival/apoptosis, migration, invasion, adhesion, aggregation and matrix remodeling. As used herein, pathologies associated with branching morphogenesis encompass pathologies where branching morphogenesis contributes to maintaining the healthy state, as well as pathologies whose course may be altered by modulation of the branching morphogenesis.

[0021] Nucleic Acids and Polypeptides of the Invention

[0022] Sequences related to CRB nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 6912321 (SEQ ID NO:1), 18175294 (SEQ ID NO:2), 21040240 (SEQ ID NO:4), 17390964 (SEQ ID NO:5), 18572383 (SEQ ID NO:11), 15877525 (SEQ ID NO:12), and 21755115 (SEQ ID NO:13) for nucleic acid, and GI#s 6912322 (SEQ ID NO:14), 18175295 (SEQ ID NO:15), and 21040241 (SEQ ID NO:16) for polypeptides. Additionally, nucleotide sequences of SEQ ID NOs: 3, 6, 7, 8, 9, and 10, and polypeptide sequence of SEQ ID NO:17 can be used in the invention.

[0023] The term “CRB polypeptide” refers to a full-length CRB protein or a functionally active fragment or derivative thereof. A “functionally active” CRB fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type CRB protein, such as antigenic or immunogenic activity, ability to bind natural cellular substrates, etc. The functional activity of CRB proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. In one embodiment, a functionally active CRB polypeptide is a CRB derivative capable of rescuing defective endogenous CRB activity, such as in cell based or animal assays; the rescuing derivative may be from the same or a different species. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of a CRB, such as an EGF-like domain or a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2). For example, the EGF-like domain of CRB from GI# 18175295 (SEQ ID NO:15) is located at approximately amino acid residues 34 to 67, 74 to 107, 114 to 145, 152 to 183, 190 to 221, 228 to 259, 266 to 298, 305 to 336, 343 to 394, 401 to 438, 445 to 480, 676 to 707, 891 to 922, 1143 to 1174, 1181 to 1211, 1218 to 1249, 1259 to 1294, and 1301 to 1332 (PFAM PF00008). Methods for obtaining CRB polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of any one of SEQ ID NOs:14-17 (a CRB). In further preferred embodiments, the fragment comprises the entire functionally active domain.
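
By way of illustration only, the fragment-length preferences above can be checked against the listed EGF-like domain coordinates. The following Python sketch assumes that the residue positions are 1-based and inclusive; the names EGF_LIKE_DOMAINS, domain_length, extract_domain and domains_meeting_minimum are hypothetical and are introduced solely for this example.

```python
# Approximate EGF-like domain boundaries for GI# 18175295 (SEQ ID NO:15),
# copied from the text (PFAM PF00008). Treating them as 1-based, inclusive
# coordinates is an assumption of this sketch.
EGF_LIKE_DOMAINS = [
    (34, 67), (74, 107), (114, 145), (152, 183), (190, 221), (228, 259),
    (266, 298), (305, 336), (343, 394), (401, 438), (445, 480), (676, 707),
    (891, 922), (1143, 1174), (1181, 1211), (1218, 1249), (1259, 1294),
    (1301, 1332),
]


def domain_length(start, end):
    """Number of residues in an inclusive start..end interval."""
    return end - start + 1


def extract_domain(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("domain coordinates fall outside the sequence")
    return sequence[start - 1:end]


def domains_meeting_minimum(domains, min_length):
    """Keep only the domains spanning at least min_length contiguous residues."""
    return [(s, e) for s, e in domains if domain_length(s, e) >= min_length]


if __name__ == "__main__":
    for minimum in (25, 50, 75, 100):
        kept = domains_meeting_minimum(EGF_LIKE_DOMAINS, minimum)
        print(minimum, len(kept))
```

Under these assumptions, each listed domain spans at least 25 residues, so a fragment comprising any single entire domain would satisfy the 25-residue preference; longer preferred fragments would generally need to span more than one domain or include flanking sequence.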

[0024] The term “CRB nucleic acid” refers to a DNA or RNA molecule that encodes a CRB polypeptide. Preferably, the CRB polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with human CRB. Methods of identifying orthologs are known in the art. Normally, orthologs in different species retain the same function, due to presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al., Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as Drosophila, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term “orthologs” encompasses paralogs.
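
By way of illustration only, the forward/reverse BLAST criterion for assigning potential orthologs can be sketched in Python as follows. The sketch assumes that the BLAST searches have already been run and parsed into dictionaries mapping each query identifier to a list of (subject identifier, score) pairs; the function names, the identifiers and the scores shown are hypothetical placeholders, not results.

```python
def best_hit(hits):
    """Return the subject ID with the highest score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(forward_hits, reverse_hits):
    """Pair each query with its best forward hit when that hit, used as the
    query in the reverse search, retrieves the original query as its best hit.

    forward_hits: species A query ID -> list of (species B subject ID, score)
    reverse_hits: species B query ID -> list of (species A subject ID, score)
    """
    pairs = []
    for query, hits in forward_hits.items():
        candidate = best_hit(hits)
        if candidate is None:
            continue
        if best_hit(reverse_hits.get(candidate, [])) == query:
            pairs.append((query, candidate))
    return pairs


if __name__ == "__main__":
    # Placeholder identifiers and scores, for illustration only.
    forward = {
        "fly_gene_1": [("human_gene_1", 400.0), ("human_gene_2", 150.0)],
        "fly_gene_2": [("human_gene_2", 300.0)],
    }
    reverse = {
        "human_gene_1": [("fly_gene_1", 390.0), ("fly_gene_2", 90.0)],
        "human_gene_2": [("fly_gene_3", 500.0), ("fly_gene_2", 280.0)],
    }
    print(reciprocal_best_hits(forward, reverse))
```

A strict reciprocal-best-hit test of this kind returns at most one partner per query, so additional paralogs arising from gene duplication would have to be recovered by a further step, such as the phylogenetic analysis described above.
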
As used herein, “percent (%) sequence identity” with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. “Percent (%) amino acid sequence similarity” is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.
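
By way of illustration only, the percent identity calculation described above can be sketched in Python for two sequences that have already been aligned. The sketch does not reproduce WU-BLAST; it assumes that the alignment is supplied as two equal-length strings with "-" marking gaps, and that, unless the caller specifies otherwise, the length over which identity is reported is the ungapped length of the subject sequence. The function name and parameters are hypothetical.

```python
GAP = "-"


def percent_identity(aligned_subject, aligned_candidate, reported_length=None):
    """Percent of identical positions between two pre-aligned sequences.

    reported_length is the sequence length over which identity is reported;
    if omitted, the ungapped length of the subject is used (an assumption).
    """
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have the same length")
    subject = aligned_subject.upper()
    candidate = aligned_candidate.upper()
    # Count aligned columns where both sequences carry the same residue.
    matches = sum(
        1 for s, c in zip(subject, candidate) if s == c and s != GAP
    )
    if reported_length is None:
        reported_length = sum(1 for s in subject if s != GAP)
    if reported_length <= 0:
        raise ValueError("reported length must be positive")
    return 100.0 * matches / reported_length


if __name__ == "__main__":
    print(percent_identity("ACDEFG", "ACDQFG"))
```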

[0025] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
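
By way of illustration only, the interchangeable groups listed above can be encoded to compute percent amino acid sequence similarity for a pre-aligned pair of sequences. In this Python sketch the group names, the function names and the use of "-" as the gap character are assumptions made for the example; the group memberships are taken from the text.

```python
# Interchangeable amino acid groups from the text, in one-letter code.
CONSERVATIVE_GROUPS = {
    "aromatic": "FWY",      # phenylalanine, tryptophan, tyrosine
    "hydrophobic": "LIMV",  # leucine, isoleucine, methionine, valine
    "polar": "QN",          # glutamine, asparagine
    "basic": "RKH",         # arginine, lysine, histidine
    "acidic": "DE",         # aspartic acid, glutamic acid
    "small": "ASTCG",       # alanine, serine, threonine, cysteine, glycine
}

# Map each residue to the name of its group for constant-time lookup.
GROUP_OF = {
    residue: name
    for name, residues in CONSERVATIVE_GROUPS.items()
    for residue in residues
}


def is_conservative(a, b):
    """True if a and b are different residues belonging to the same group."""
    return a != b and a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b)


def percent_similarity(aligned_subject, aligned_candidate):
    """Percent of subject residues aligned to an identical or conservatively
    substituted residue; gap columns ('-') never count as similar."""
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have the same length")
    subject = aligned_subject.upper()
    candidate = aligned_candidate.upper()
    length = sum(1 for s in subject if s != "-")
    if length == 0:
        raise ValueError("subject sequence is empty")
    similar = sum(
        1
        for s, c in zip(subject, candidate)
        if s != "-" and (s == c or is_conservative(s, c))
    )
    return 100.0 * similar / length


if __name__ == "__main__":
    print(percent_similarity("FKDA", "YRES"))
```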

[0026] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute; Smith and Waterman, 1981, J. of Molec.Biol., 147:195-197; Nicholas et al., 1998, “A Tutorial on Searching Sequence Databases and Sequence Scoring Methods” (www.psc.edu) and references cited therein.; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the “Match” value reflects “sequence identity.”
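
By way of illustration only, a local alignment score using the example gap open penalty of 12 and gap extension penalty of two can be sketched in Python as follows. This is a simplified sketch, not the referenced implementation: it uses a flat match/mismatch score (the values 5 and -4 are assumptions) in place of the Dayhoff matrix, it charges the gap open penalty for the first gapped position and the extension penalty for each additional position (one of several conventions), and it returns only the best score without a traceback.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=2):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of a local alignment ending at (i, j).
    # E: best score ending at (i, j) with a gap in a (b residue unpaired).
    # F: best score ending at (i, j) with a gap in b (a residue unpaired).
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diagonal = H[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            # Local alignment: scores are never allowed to fall below zero.
            H[i][j] = max(0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    print(smith_waterman_score("AAAAACCCCC", "AAAAAGGGCCCCC"))
```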

[0027] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of any of SEQ ID NOs:1-13. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of any one of SEQ ID NOs:1-13 under high stringency hybridization conditions that are: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65° C. in a solution comprising 6× saline sodium citrate (SSC) (1×SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5×Denhardt's solution, 0.05% sodium pyrophosphate and 100 μg/ml herring sperm DNA; hybridization for 18-20 hours at 65° C. in a solution containing 6×SSC, 1×Denhardt's solution, 100 μg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65° C. for 1 h in a solution containing 0.1×SSC and 0.1% SDS (sodium dodecyl sulfate).

[0028] In other embodiments, moderately stringent hybridization conditions are used that comprise: pretreatment of filters containing nucleic acid for 6 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40° C. in a solution containing 35% formamide, 5×SSC, 50 mM Tris-HCl (pH7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55° C. in a solution containing 2×SSC and 0.1% SDS.

[0029] Alternatively, low stringency conditions can be used that comprise: incubation for 8 hours to overnight at 37° C. in a solution comprising 20% formamide, 5×SSC, 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1×SSC at about 37° C. for 1 hour.
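
By way of illustration only, the salt concentrations implied by the SSC strengths recited in the high, moderate and low stringency conditions can be computed from the stated 1×SSC composition (0.15 M NaCl, 0.015 M Na citrate). In this Python sketch the dictionary and function names are hypothetical; the wash strengths and temperatures are those recited above.

```python
# Composition of 1x SSC in mol/L, as stated in the text.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Final wash conditions recited for each stringency level.
WASH_CONDITIONS = {
    "high": {"ssc_fold": 0.1, "temp_c": 65},
    "moderate": {"ssc_fold": 2.0, "temp_c": 55},
    "low": {"ssc_fold": 1.0, "temp_c": 37},
}


def ssc_molarity(fold):
    """Molar concentration of each SSC component at the given strength."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    return {component: fold * molar for component, molar in SSC_1X.items()}


if __name__ == "__main__":
    for level, wash in WASH_CONDITIONS.items():
        salts = ssc_molarity(wash["ssc_fold"])
        print(level, wash["temp_c"], round(salts["NaCl"], 4),
              round(salts["Na citrate"], 5))
```

On this arithmetic, the 6×SSC used for high stringency hybridization corresponds to about 0.9 M NaCl, while the 0.1×SSC high stringency wash corresponds to about 0.015 M NaCl.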

[0030] Isolation, Production, Expression, and Mis-expression of CRB Nucleic Acids and Polypeptides

[0031] CRB nucleic acids and polypeptides are useful for identifying and testing agents that modulate CRB function and for other applications related to the involvement of CRB in branching morphogenesis. CRB nucleic acids and derivatives and orthologs thereof may be obtained using any available method. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of a CRB protein for assays used to assess CRB function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2nd edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, New Jersey, 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York).

[0032] The nucleotide sequence encoding a CRB polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native CRB gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. An isolated host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0033] To detect expression of the CRB gene product, the expression vector can comprise a promoter operably linked to a CRB gene nucleic acid, one or more origins of replication, and, one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the CRB gene product based on the physical or functional properties of the CRB protein in in vitro assay systems (e.g. immunoassays).

[0034] The CRB protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0035] Once a recombinant cell that expresses the CRB gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native CRB proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0036] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of CRB or other genes associated with branching morphogenesis. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

[0037] Genetically Modified Animals

[0038] Animal models that have been genetically modified to alter CRB expression may be used in in vivo assays to test for activity of a candidate branching morphogenesis modulating agent, or to further assess the role of CRB in a branching morphogenesis process such as apoptosis or cell proliferation. Preferably, the altered CRB expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal CRB expression. The genetically modified animal may additionally have defective branching morphogenesis. Preferred genetically modified animals are mammals such as primates, rodents (preferably mice or rats), among others. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of its cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763.) or stably integrated into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0039] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sandford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000);136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0040] In one embodiment, the transgenic animal is a “knock-out” animal having a heterozygous or homozygous alteration in the sequence of an endogenous CRB gene that results in a decrease of CRB function, preferably such that CRB expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse CRB gene is used to construct a homologous recombination vector suitable for altering an endogenous CRB gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson M H et al., (1994) Scand J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0041] In another embodiment, the transgenic animal is a “knock-in” animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the CRB gene, e.g., by introduction of additional copies of CRB, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the CRB gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0042] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0043] The genetically modified animals can be used in genetic studies to further elucidate branching morphogenesis, as animal models of disease and disorders implicating defective branching morphogenesis function, and for in vivo testing of candidate therapeutic agents, such as those identified in screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered CRB function, and phenotypic changes are compared with those in appropriate control animals, such as genetically modified animals that receive placebo treatment and/or animals with unaltered CRB expression that receive the candidate therapeutic agent.

[0044] In addition to the above-described genetically modified animals having altered CRB function, animal models having defective branching morphogenesis function (and otherwise normal CRB function) can be used in the methods of the present invention. Preferably, the candidate branching morphogenesis modulating agent, when administered to a model system with cells defective in branching morphogenesis function, produces a detectable phenotypic change in the model system indicating that the branching morphogenesis function is restored.

[0045] Modulating Agents

[0046] The invention provides methods to identify agents that interact with and/or modulate the function of CRB and/or branching morphogenesis. Modulating agents identified by the methods are also part of the invention. Such agents are useful in a variety of diagnostic and therapeutic applications associated with branching morphogenesis, as well as in further analysis of the CRB protein and its contribution to branching morphogenesis. Accordingly, the invention also provides methods for modulating branching morphogenesis comprising the step of specifically modulating CRB activity by administering a CRB-interacting or -modulating agent.

[0047] As used herein, a “CRB-modulating agent” is any agent that modulates CRB function, for example, an agent that interacts with CRB to inhibit or enhance CRB activity or otherwise affect normal CRB function. CRB function can be affected at any level, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a preferred embodiment, the CRB-modulating agent specifically modulates the function of the CRB. The phrases “specific modulating agent”, “specifically modulates”, etc., are used herein to refer to modulating agents that directly bind to the CRB polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the CRB. These phrases also encompass modulating agents that alter the interaction of the CRB with a binding partner, substrate, or cofactor (e.g. by binding to a binding partner of a CRB, or to a protein/binding partner complex, and altering CRB function). In a further preferred embodiment, the CRB-modulating agent is a modulator of branching morphogenesis (e.g. it restores and/or upregulates branching morphogenesis) and thus is also a branching morphogenesis-modulating agent.

[0048] Preferred CRB-modulating agents include small molecule compounds; CRB-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in “Remington's Pharmaceutical Sciences” Mack Publishing Co., Easton, Pa., 19th edition.

[0049] Small Molecule Modulators

[0050] Small molecules are often preferred to modulate function of proteins with enzymatic function, and/or containing protein interaction domains. Chemical agents, referred to in the art as “small molecule” compounds, are typically organic, non-peptide molecules, having a molecular weight less than 10,000, preferably less than 5,000, more preferably less than 1,000, and most preferably less than 500. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the CRB protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for CRB-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 287: 1964-1969; Radmann J and Gunther J, Science (2000) 287:1947-1948).

[0051] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with branching morphogenesis. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

[0052] Protein Modulators

[0053] Specific CRB-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to branching morphogenesis and related disorders, as well as in validation assays for other CRB-modulating agents. In a preferred embodiment, CRB-interacting proteins affect normal CRB function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, CRB-interacting proteins are useful in detecting and providing information about the function of CRB proteins, as is relevant to branching morphogenesis related disorders, such as cancer (e.g., for diagnostic means).

[0054] A CRB-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with a CRB, such as a member of the CRB pathway that modulates CRB expression, localization, and/or activity. CRB-modulators include dominant negative forms of CRB-interacting proteins and of CRB proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous CRB-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashena S J et al., Gene (2000) 250:1-14; Drees B L Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3rd, Trends Genet (2000) 16:5-8).

[0055] A CRB-interacting protein may be an exogenous protein, such as a CRB-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). CRB antibodies are further discussed below.

[0056] In preferred embodiments, a CRB-interacting protein specifically binds a CRB protein. In alternative preferred embodiments, a CRB-modulating agent binds a CRB substrate, binding partner, or cofactor.

[0057] Antibodies

[0058] In another embodiment, the protein modulator is a CRB specific antibody agonist or antagonist. The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify CRB modulators. The antibodies can also be used in dissecting the portions of the CRB pathway responsible for various cellular responses and in the general processing and maturation of the CRB.

[0059] Antibodies that specifically bind CRB polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of CRB polypeptide, and more preferably, to human CRB. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab′)₂ fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of CRB which are particularly antigenic can be selected, for example, by routine screening of CRB polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Wood (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Wood, (1983) Mol. Immunol. 20:483-89; Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence shown in any of SEQ ID NOs:14-17. Monoclonal antibodies with affinities of 10⁸ M⁻¹, preferably 10⁹ M⁻¹ to 10¹⁰ M⁻¹, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292; 4,451,570; and 4,618,577). Antibodies may be generated against crude cell extracts of CRB or substantially purified fragments thereof. If CRB fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of a CRB protein. In a particular embodiment, CRB-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate immune system such as a laboratory rabbit or mouse is immunized according to conventional protocols.

[0060] The presence of CRB-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding CRB polypeptides. Other assays, such as radioimmunoassays or fluorescent assays, might also be used.

[0061] Chimeric antibodies specific to CRB polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 314:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 332: 323-327). Humanized antibodies contain ˜10% murine sequences and ˜90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co M S, and Queen C. 1991 Nature 351: 501-502; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0062] CRB-specific single chain antibodies, which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0063] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or, alternatively, selection from libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0064] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0065] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibodies' concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
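
As a worked illustration of the weight-based range and vehicle concentration range stated above, the following minimal sketch computes a total antibody amount and the corresponding injection volume. The function names and the 70 kg example patient are hypothetical and are not part of the disclosure; the sketch is arithmetic only, since the therapeutically effective dose is determined by clinical studies.

```python
# Illustrative arithmetic only; not dosing guidance.
DOSE_RANGE_MG_PER_KG = (0.1, 10.0)  # typical amount range stated in the text
CONC_RANGE_MG_PER_ML = (1.0, 10.0)  # typical vehicle concentration range stated in the text


def total_dose_mg(weight_kg: float, dose_mg_per_kg: float) -> float:
    """Total antibody amount (mg) for a patient of the given weight."""
    low, high = DOSE_RANGE_MG_PER_KG
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose is outside the typical range stated in the text")
    return weight_kg * dose_mg_per_kg


def injection_volume_ml(dose_mg: float, conc_mg_per_ml: float) -> float:
    """Volume (ml) of formulated antibody that delivers dose_mg."""
    low, high = CONC_RANGE_MG_PER_ML
    if dose_mg <= 0:
        raise ValueError("dose must be positive")
    if not low <= conc_mg_per_ml <= high:
        raise ValueError("concentration is outside the typical range stated in the text")
    return dose_mg / conc_mg_per_ml


if __name__ == "__main__":
    # Hypothetical 70 kg patient dosed at 1 mg/kg from a 10 mg/ml formulation.
    dose = total_dose_mg(weight_kg=70.0, dose_mg_per_kg=1.0)
    print(dose, injection_volume_ml(dose, conc_mg_per_ml=10.0))
```

The range checks simply mirror the "typically" ranges in the text; a value outside them is rejected by this sketch but is not excluded by the disclosure.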

[0066] Specific Biotherapeutics

[0067] In a preferred embodiment, a CRB-interacting protein may have biotherapeutic applications. Biotherapeutic agents formulated in pharmaceutically acceptable carriers and dosages may be used to activate or inhibit signal transduction pathways. This modulation may be accomplished by binding a ligand, thus inhibiting the activity of the pathway; or by binding a receptor, either to inhibit activation of, or to activate, the receptor. Alternatively, the biotherapeutic may itself be a ligand capable of activating or inhibiting a receptor. Biotherapeutic agents and methods of producing them are described in detail in U.S. Pat. No. 6,146,628.

[0068] Since CRB is a receptor, its ligand(s), antibodies to the ligand(s) or the CRB itself may be used as biotherapeutics to modulate the activity of CRB in branching morphogenesis.

[0069] Nucleic Acid Modulators

[0070] Other preferred CRB-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit CRB activity. Preferred nucleic acid modulators interfere with the function of the CRB nucleic acid such as DNA replication, transcription, translocation of the CRB RNA to the site of protein translation, translation of protein from the CRB RNA, splicing of the CRB RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the CRB RNA.

[0071] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to a CRB mRNA to bind to and prevent translation, preferably by binding to the 5′ untranslated region. CRB-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
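
The complementarity requirement described above can be sketched as a reverse-complement computation. In this minimal example the function name and the input sequences are hypothetical placeholders (they are not CRB sequences from the sequence listing), and the oligomer is assumed to be unmodified single-stranded DNA; the length bounds are the 6 to about 200 nucleotide range stated in the text.

```python
# Watson-Crick complement of each target base, written as a DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}
MIN_LEN, MAX_LEN = 6, 200  # length range stated in the text


def antisense_dna_oligo(target_5to3: str) -> str:
    """Return the DNA oligo (5'->3') complementary to a target written 5'->3'.

    The target may be given as RNA (with U) or as its cDNA form (with T).
    """
    target = target_5to3.upper()
    if not MIN_LEN <= len(target) <= MAX_LEN:
        raise ValueError("target length is outside the 6-200 nucleotide range")
    try:
        # Reverse the target, then complement each base.
        return "".join(COMPLEMENT[base] for base in reversed(target))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]}") from None


if __name__ == "__main__":
    # Hypothetical 10-nucleotide target segment, for illustration only.
    print(antisense_dna_oligo("AUGGCCAAGU"))
```

A practical design would also weigh target accessibility, melting temperature and off-target matches, none of which this sketch attempts.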

[0072] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphorodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S. Pat. No. 5,235,033; and U.S. Pat. No. 5,378,841).

[0073] Alternative preferred CRB nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. Chem. Biochem. 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498).

[0074] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. For example, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al, Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, a CRB-specific nucleic acid modulator is used in an assay to further elucidate the role of the CRB in branching morphogenesis, and/or its relationship to other members of the pathway. In another aspect of the invention, a CRB-specific antisense oligomer is used as a therapeutic agent for treatment of branching morphogenesis-related disease states.

[0075] Zebrafish is a particularly useful model for the study of branching morphogenesis using antisense oligomers. For example, PMOs are used to selectively inactivate one or more genes in vivo in the Zebrafish embryo. By injecting PMOs into Zebrafish at the 1-16 cell stage, candidate targets emerging from the Drosophila screens are validated in this vertebrate model system. In another aspect of the invention, PMOs are used to screen the Zebrafish genome for identification of other therapeutic modulators of branching morphogenesis. In a further aspect of the invention, a CRB-specific antisense oligomer is used as a therapeutic agent for treatment of pathologies associated with branching morphogenesis.

[0076] Assay Systems

[0077] The invention provides assay systems and screening methods for identifying specific modulators of CRB activity. As used herein, an “assay system” encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the CRB nucleic acid or protein. In general, secondary assays further assess the activity of a CRB modulating agent identified by a primary assay and may confirm that the modulating agent affects CRB in a manner relevant to branching morphogenesis. In some cases, CRB modulators will be directly tested in a secondary assay.

[0078] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising a CRB polypeptide or nucleic acid with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. binding activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates CRB activity, and hence branching morphogenesis. The CRB polypeptide or nucleic acid used in the assay may comprise any of the nucleic acids or polypeptides described above.
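
The comparison between agent-biased activity and reference activity described above can be illustrated with one possible significance test. The text does not prescribe a particular statistical method; this minimal sketch assumes replicate wells per condition and uses an exact two-sided permutation test on the difference of means. The function name and the example readings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(reference, treated):
    """Exact two-sided permutation p-value for a difference in mean activity.

    Every way of relabeling the pooled readings into a reference group and a
    treated group of the original sizes is enumerated, so this is practical
    only for small numbers of replicates.
    """
    reference, treated = list(reference), list(treated)
    if not reference or not treated:
        raise ValueError("both groups need at least one reading")
    pooled = reference + treated
    n_ref, n_trt = len(reference), len(treated)
    total = sum(pooled)
    observed = abs(mean(treated) - mean(reference))
    at_least_as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_ref):
        ref_sum = sum(pooled[i] for i in idx)
        diff = abs((total - ref_sum) / n_trt - ref_sum / n_ref)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


if __name__ == "__main__":
    # Hypothetical binding signals (arbitrary units), four wells per condition.
    reference_wells = [100, 102, 98, 101]  # no candidate agent
    agent_wells = [60, 62, 58, 61]         # candidate agent present
    print(permutation_p_value(reference_wells, agent_wells))
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so the number of replicates bounds the significance level that can be claimed.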

[0079] Primary Assays

[0080] The type of modulator tested generally determines the type of primary assay.

[0081] Primary Assays for Small Molecule Modulators

[0082] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term “cell-based” refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term “cell free” encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0083] Cell-based screening assays usually require systems for recombinant expression of CRB and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when CRB-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the CRB protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate CRB-specific binding agents to function as negative effectors in CRB-expressing cells), binding equilibrium constants (usually at least about 10⁷ M⁻¹, preferably at least about 10⁸ M⁻¹, more preferably at least about 10⁹ M⁻¹), and immunogenicity (e.g. ability to elicit CRB specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.
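
The binding equilibrium constant thresholds stated above are association constants; the equivalent dissociation constant is the reciprocal, Kd = 1/Ka, so that 10⁷ M⁻¹, 10⁸ M⁻¹ and 10⁹ M⁻¹ correspond to 100 nM, 10 nM and 1 nM respectively. The following minimal sketch performs that conversion and reports which stated threshold a measured value meets. The function names and the tier labels are hypothetical shorthand for the "usually", "preferably" and "more preferably" language of the text.

```python
# Thresholds from the text, highest first, with shorthand labels (assumed).
TIERS = [
    (1e9, "more preferred"),
    (1e8, "preferred"),
    (1e7, "usual minimum"),
]


def kd_nanomolar(ka_per_molar: float) -> float:
    """Dissociation constant in nM for an association constant in 1/M."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1e9 / ka_per_molar


def affinity_tier(ka_per_molar: float) -> str:
    """Label the highest stated threshold that the association constant meets."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    for threshold, label in TIERS:
        if ka_per_molar >= threshold:
            return label
    return "below usual minimum"


if __name__ == "__main__":
    ka = 5e8  # hypothetical measured association constant, 1/M
    print(kd_nanomolar(ka), affinity_tier(ka))
```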

[0084] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of a CRB polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The CRB polypeptide can be full length or a fragment thereof that retains functional CRB activity. The CRB polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The CRB polypeptide is preferably human CRB, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of CRB interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has CRB-specific binding activity, and can be used to assess normal CRB gene function.

[0085] Suitable assay formats that may be adapted to screen for CRB modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes P B, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).
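
For the fluorescence polarization readout mentioned above, the standard definition is P = (I∥ − G·I⊥)/(I∥ + G·I⊥), where I∥ and I⊥ are the emission intensities measured parallel and perpendicular to the excitation polarization and G is an instrument correction factor. The minimal sketch below reports the value in millipolarization units (mP); the function name, the default G of 1 and the example intensities are assumptions for illustration and are not taken from the cited references.

```python
def polarization_mp(i_parallel: float, i_perpendicular: float, g_factor: float = 1.0) -> float:
    """Fluorescence polarization in millipolarization units (mP)."""
    if i_parallel < 0 or i_perpendicular < 0 or g_factor <= 0:
        raise ValueError("intensities must be non-negative and G must be positive")
    corrected_perp = g_factor * i_perpendicular
    denominator = i_parallel + corrected_perp
    if denominator == 0:
        raise ValueError("no signal: both intensities are zero")
    return 1000.0 * (i_parallel - corrected_perp) / denominator


if __name__ == "__main__":
    # Hypothetical readings: a small labeled ligand tumbles quickly when free
    # (low polarization) and more slowly when bound to a larger protein.
    free_ligand = polarization_mp(i_parallel=1100.0, i_perpendicular=1000.0)
    bound_ligand = polarization_mp(i_parallel=1600.0, i_perpendicular=1000.0)
    print(free_ligand, bound_ligand)
```

In a displacement format, a candidate agent that competes with the labeled partner for the target would be expected to shift the reading from the bound value back toward the free value.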

[0086] A variety of suitable assay systems may be used to identify candidate CRB and branching morphogenesis modulators (e.g. U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays); U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434 (angiogenesis assays), among others). Specific preferred assays are described in more detail below.

[0087] Apoptosis assays. Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). An apoptosis assay system may comprise a cell that expresses a CRB, and that optionally has defective branching morphogenesis function. A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate branching morphogenesis modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate branching morphogenesis modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether CRB function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express CRB relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the CRB plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0088] Cell proliferation and cell cycle assays. Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means.

[0089] Cell proliferation may also be examined using [³H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [³H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter). Another proliferation assay uses the dye Alamar Blue (available from Biosource International), which fluoresces when reduced in living cells and provides an indirect measurement of cell number (Voytik-Harbin S L et al., 1998, In Vitro Cell Dev Biol Anim 34:239-46).

[0090] Cell proliferation may also be assayed by colony formation in soft agar (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with CRB are seeded in soft agar plates, and colonies are measured and counted after two weeks of incubation.

[0091] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with a CRB may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson), which indicates accumulation of cells in different stages of the cell cycle.

[0092] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses a CRB, and that optionally has defective branching morphogenesis function. A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate branching morphogenesis modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate branching morphogenesis modulating agent that is initially identified using another assay system such as a cell-free assay system. A cell proliferation assay may also be used to test whether CRB function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express CRB relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the CRB plays a direct role in cell proliferation or cell cycle.

[0093] Angiogenesis. Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlok cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel® (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses a CRB, and that optionally has defective branching morphogenesis function. A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate branching morphogenesis modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate branching morphogenesis modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether CRB function plays a direct role in angiogenesis. For example, an angiogenesis assay may be performed on cells that over- or under-express CRB relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the CRB plays a direct role in angiogenesis. Angiogenesis assays are described in U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434, among others.

[0094] Hypoxic induction. The alpha subunit of the transcription factor, hypoxia inducible factor-1 (HIF-1), is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumor cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with CRB in hypoxic conditions (such as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by Taqman®. For example, a hypoxic induction assay system may comprise a cell that expresses a CRB, and that optionally has defective branching morphogenesis function. A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate branching morphogenesis modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate branching morphogenesis modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether CRB function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express CRB relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the CRB plays a direct role in hypoxic induction.

[0095] Cell adhesion. Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in the presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 μg/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2× final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.
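By way of illustration only, the plate-reader signal from a cell-protein adhesion assay of this kind might be reduced to a single value as in the following minimal Python sketch. It subtracts the mean signal of the uncoated negative control wells and expresses adhesion in compound-treated wells as a percentage of untreated coated wells. The function name, the well readings, and the use of a simple mean are assumptions made for this example and are not part of the described assay.

```python
from statistics import mean


def percent_adhesion(treated, untreated, uncoated):
    """Adhesion in treated wells as a percentage of untreated coated wells.

    Each argument is a list of fluorescence readings (arbitrary units).
    The uncoated wells are used as the background signal.
    """
    background = mean(uncoated)
    reference = mean(untreated) - background
    if reference <= 0:
        raise ValueError("untreated coated wells must exceed background")
    return 100.0 * (mean(treated) - background) / reference


# Hypothetical readings, for illustration only
uncoated_wells = [100.0, 110.0, 90.0]
untreated_wells = [1100.0, 1000.0, 1200.0]
treated_wells = [600.0, 650.0, 550.0]

print(percent_adhesion(treated_wells, untreated_wells, uncoated_wells))
```

A value well below 100 would suggest that the compound reduces adhesion; the cut-off used to call a candidate modulating agent is left to the experimenter.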

[0096] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0097] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. This assay determines not only the binding specificity of the peptides and modulators against cell lines, but also the functional cell signaling of attached cells, which is measured using immunofluorescence techniques in situ on the microchip (Falsey J R et al., Bioconjug Chem. May-June 2001; 12(3):346-53).

[0098] Tubulogenesis. Tubulogenesis assays monitor the ability of cultured cells, generally endothelial cells, to form tubular structures on a matrix substrate, which generally simulates the environment of the extracellular matrix. Exemplary substrates include Matrigel™ (Becton Dickinson), an extract of basement membrane proteins containing laminin, collagen IV, and heparan sulfate proteoglycan, which is liquid at 4° C. and forms a solid gel at 37° C. Other suitable matrices comprise extracellular components such as collagen, fibronectin, and/or fibrin. Cells are stimulated with a pro-angiogenic stimulant, and their ability to form tubules is detected by imaging. Tubules can generally be detected after an overnight incubation with stimuli, but longer or shorter time frames may also be used. Tube formation assays are well known in the art (e.g., Jones M K et al., 1999, Nature Medicine 5:1418-1423). These assays have traditionally involved stimulation with serum or with the growth factors FGF or VEGF. Serum represents an undefined source of growth factors. In a preferred embodiment, the assay is performed with cells cultured in serum free medium, in order to control which process or pathway a candidate agent modulates. Moreover, we have found that different target genes respond differently to stimulation with different pro-angiogenic agents, including inflammatory angiogenic factors such as TNF-alpha. Thus, in a further preferred embodiment, a tubulogenesis assay system comprises testing a CRB's response to a variety of factors, such as FGF, VEGF, phorbol myristate acetate (PMA), TNF-alpha, ephrin, etc.

[0099] Cell Migration. An invasion/migration assay (also called a migration assay) tests the ability of cells to overcome a physical barrier and to migrate towards pro-angiogenic signals. Migration assays are known in the art (e.g., Paik J H et al., 2001, J Biol Chem 276:11830-11837). In a typical experimental set-up, cultured endothelial cells are seeded onto a matrix-coated porous lamina, with pore sizes generally smaller than typical cell size. The matrix generally simulates the environment of the extracellular matrix, as described above. The lamina is typically a membrane, such as the transwell polycarbonate membrane (Corning Costar Corporation, Cambridge, Mass.), and is generally part of an upper chamber that is in fluid contact with a lower chamber containing pro-angiogenic stimuli. Migration is generally assayed after an overnight incubation with stimuli, but longer or shorter time frames may also be used. Migration is assessed as the number of cells that crossed the lamina, and may be detected by staining cells with hematoxylin solution (VWR Scientific, South San Francisco, Calif.), or by any other method for determining cell number. In another exemplary set-up, cells are fluorescently labeled and migration is detected using fluorescent readings, for instance using the Falcon HTS FluoroBlok (Becton Dickinson). While some migration is observed in the absence of stimulus, migration is greatly increased in response to pro-angiogenic factors. As described above, a preferred assay system for migration/invasion assays comprises testing a CRB's response to a variety of pro-angiogenic factors, including tumor angiogenic and inflammatory angiogenic agents, and culturing the cells in serum free medium.

[0100] Sprouting assay. A sprouting assay is a three-dimensional in vitro angiogenesis assay that uses a cell-number defined spheroid aggregation of endothelial cells (“spheroid”), embedded in a collagen gel-based matrix. The spheroid can serve as a starting point for the sprouting of capillary-like structures by invasion into the extracellular matrix (termed “cell sprouting”) and the subsequent formation of complex anastomosing networks (Korff and Augustin, 1999, J Cell Sci 112:3249-58). In an exemplary experimental set-up, spheroids are prepared by pipetting 400 human umbilical vein endothelial cells into individual wells of nonadhesive 96-well plates to allow overnight spheroidal aggregation (Korff and Augustin: J Cell Biol 143: 1341-52, 1998). Spheroids are harvested and seeded in 900 μl of methocel-collagen solution and pipetted into individual wells of a 24 well plate to allow collagen gel polymerization. Test agents are added after 30 min by pipetting 100 μl of 10-fold concentrated working dilution of the test substances on top of the gel. Plates are incubated at 37° C. for 24 h. Dishes are fixed at the end of the experimental incubation period by addition of paraformaldehyde. Sprouting intensity of endothelial cells can be quantitated by an automated image analysis system to determine the cumulative sprout length per spheroid.
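The arithmetic behind the sprouting assay can be sketched as follows, purely as an illustration. The first function shows why a 10-fold concentrated working dilution is used: 100 μl layered on a 900 μl gel gives the 1-fold concentration once the agent has equilibrated through the full 1000 μl (an assumption of complete equilibration). The second and third functions compute the cumulative sprout length per spheroid and its mean over several spheroids. All function names and measurement values are hypothetical, and the image analysis that yields the individual sprout lengths is not shown.

```python
def final_concentration(stock_conc, overlay_volume_ul, gel_volume_ul):
    """Concentration after the overlay equilibrates through gel plus overlay."""
    return stock_conc * overlay_volume_ul / (overlay_volume_ul + gel_volume_ul)


def cumulative_sprout_length(sprout_lengths_um):
    """Sum of the lengths of all sprouts measured on one spheroid."""
    return sum(sprout_lengths_um)


def mean_cumulative_sprout_length(spheroids):
    """Mean cumulative sprout length over a list of spheroids.

    Each spheroid is given as a list of individual sprout lengths.
    """
    totals = [cumulative_sprout_length(s) for s in spheroids]
    return sum(totals) / len(totals)


# 100 ul of a 10x working dilution on a 900 ul gel gives 1x
print(final_concentration(10.0, 100.0, 900.0))

# Hypothetical sprout lengths (micrometers) for three spheroids
control = [[120.0, 80.0, 50.0], [100.0, 100.0], [90.0, 60.0]]
print(mean_cumulative_sprout_length(control))
```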

[0101] Primary Assays for Antibody Modulators

[0102] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the CRB protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting CRB-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

[0103] In some cases, screening assays described for small molecule modulators may also be used to test antibody modulators.

[0104] Primary Assays for Nucleic Acid Modulators

[0105] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance CRB gene expression, preferably mRNA expression. In general, expression analysis comprises comparing CRB expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express CRB) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using the TaqMan®, PE Applied Biosystems), or microarray analysis may be used to confirm that CRB mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie, A Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the CRB protein or specific peptides. A variety of means including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).

[0106] In some cases, screening assays described for small molecule modulators, particularly in assay systems that involve CRB mRNA expression, may also be used to test nucleic acid modulators.

[0107] Secondary Assays

[0108] Secondary assays may be used to further assess the activity of a CRB-modulating agent identified by any of the above methods to confirm that the modulating agent affects CRB in a manner relevant to branching morphogenesis. As used herein, CRB-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with CRB.

[0109] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express CRB) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate CRB-modulating agent results in changes in branching morphogenesis in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use “sensitized genetic backgrounds”, which, as used herein, describe cells or animals engineered for altered expression of genes in the branching morphogenesis or interacting pathways.

[0110] Cell-based Assays

[0111] Cell based assays may use a variety of mammalian cell types. Preferred cells are capable of branching morphogenesis processes and are generally endothelial cells. Exemplary cells include human umbilical vein endothelial cells (HUVECs), human renal microvascular endothelial cells (HRMECs), human dermal microvascular endothelial cells (HDMECs), human uterine microvascular endothelial cells, human lung microvascular endothelial cells, human coronary artery endothelial cells, and immortalized microvascular cells, among others. Cell based assays may rely on the endogenous expression of CRB and/or other genes, such as those involved in branching morphogenesis, or may involve recombinant expression of these genes. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

[0112] Cell-based assays may detect a variety of events associated with branching morphogenesis and angiogenesis, including cell proliferation, apoptosis, cell migration, tube formation, sprouting and hypoxic induction, as described above.

[0113] Animal Assays

[0114] A variety of non-human animal models of branching morphogenesis, including angiogenesis, and related pathologies may be used to test candidate CRB modulators. Animal assays may rely on the endogenous expression of CRB and/or other genes, such as those involved in branching morphogenesis, or may involve engineered expression of these genes. In some cases, CRB expression or CRB protein may be restricted to a particular implanted tissue or matrix. Animal assays generally require systemic delivery of a candidate modulator, such as by oral administration, injection (intravenous, subcutaneous, intraperitoneal), bolus administration, etc.

[0115] In a preferred embodiment, branching morphogenesis activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal branching morphogenesis are used to test the candidate modulator's effect on CRB in Matrigel® assays. Matrigel® is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4° C., but rapidly forms a solid gel at 37° C. Liquid Matrigel® is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the CRB. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel® pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel® pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0116] In another preferred embodiment, the effect of the candidate modulator on CRB is assessed via tumorigenicity assays. In one example, a xenograft comprising human cells from a pre-existing tumor or a tumor cell line known to be angiogenic is used; exemplary cell lines include A431, Colo205, MDA-MB-435, A673, A375, Calu-6, MDA-MB-231, 460, SF763T, or SKOV3tp5. Tumor xenograft assays are known in the art (see, e.g., Ogawa K et al., 2000, Oncogene 19:6043-6052). Xenografts are typically implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the CRB endogenously are injected in the flank, 1×10⁵ to 1×10⁷ cells per mouse in a volume of 100 μL using a 27 gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M phosphate, pH 7.2, for 6 hours at 4° C., immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.
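As a non-limiting illustration of how caliper measurements might be turned into a treatment start date, the Python sketch below estimates tumor weight from two perpendicular diameters and returns the first measurement day on which the group mean reaches 100 mg. The specification states only that the two diameters are multiplied; the ellipsoid approximation used here (length × width² / 2, with an assumed density of 1 mg per mm³) is a commonly used convention that is swapped in for this example and is an assumption, as are the function names and the measurement values.

```python
def tumor_weight_mg(length_mm, width_mm):
    """Estimate tumor weight from two perpendicular caliper diameters.

    Assumption: ellipsoid approximation (length * width^2) / 2 and a
    tissue density of 1 mg per cubic millimeter.
    """
    return length_mm * width_mm ** 2 / 2.0


def treatment_start_day(measurements, threshold_mg=100.0):
    """Return the first day on which the mean tumor weight reaches the threshold.

    measurements maps a study day to a list of (length_mm, width_mm)
    tuples, one per mouse. Returns None if the threshold is never reached.
    """
    for day in sorted(measurements):
        weights = [tumor_weight_mg(l, w) for l, w in measurements[day]]
        if sum(weights) / len(weights) >= threshold_mg:
            return day
    return None


# Hypothetical twice-weekly caliper readings for two mice
readings = {
    7: [(5.0, 4.0), (6.0, 4.0)],
    10: [(7.0, 5.0), (8.0, 5.0)],
    14: [(8.0, 6.0), (9.0, 5.0)],
}
print(treatment_start_day(readings))
```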

[0117] In another preferred embodiment, tumorigenicity is monitored using a hollow fiber assay, which is described in U.S. Pat. No. 5,698,413. Briefly, the method comprises implanting into a laboratory animal a biocompatible, semi-permeable encapsulation device containing target cells, treating the laboratory animal with a candidate modulating agent, and evaluating the target cells for reaction to the candidate modulator. Implanted cells are generally human cells from a pre-existing tumor or a tumor cell line known to be angiogenic. After an appropriate period of time, generally around six days, the implanted samples are harvested for evaluation of the candidate modulator. Tumorigenicity and modulator efficacy may be evaluated by assaying the quantity of viable cells present in the macrocapsule, which can be determined by tests known in the art, for example, MTT dye conversion assay, neutral red dye uptake, trypan blue staining, viable cell counts, the number of colonies formed in soft agar, the capacity of the cells to recover and replicate in vitro, etc. Other assays specific to angiogenesis, as are known in the art and described herein, may also be used.

[0118] In another preferred embodiment, a tumorigenicity assay uses a transgenic animal, usually a mouse, carrying a dominant oncogene or tumor suppressor gene knockout under the control of tissue specific regulatory sequences; these assays are generally referred to as transgenic tumor assays. In a preferred application, tumor development in the transgenic model is well characterized or is controlled. In an exemplary model, the “RIP1-Tag2” transgene, comprising the SV40 large T-antigen oncogene under control of the insulin gene regulatory regions, is expressed in pancreatic beta cells and results in islet cell carcinomas (Hanahan D, 1985, Nature 315:115-122; Parangi S et al, 1996, Proc Natl Acad Sci USA 93: 2002-2007; Bergers G et al, 1999, Science 284:808-812). An “angiogenic switch” occurs at approximately five weeks, as normally quiescent capillaries in a subset of hyperproliferative islets become angiogenic. The RIP1-Tag2 mice die by age 14 weeks. Candidate modulators may be administered at a variety of stages, including just prior to the angiogenic switch (e.g., for a model of tumor prevention), during the growth of small tumors (e.g., for a model of intervention), or during the growth of large and/or invasive tumors (e.g., for a model of regression). Tumorigenicity and modulator efficacy can be evaluated by assessing life-span extension and/or tumor characteristics, including number of tumors, tumor size, tumor morphology, vessel density, apoptotic index, etc.

[0119] Diagnostic and Therapeutic Uses

[0120] Specific CRB-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in branching morphogenesis, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating branching morphogenesis in a cell, preferably a cell pre-determined to have defective or impaired branching morphogenesis function (e.g. due to overexpression, underexpression, or misexpression of branching morphogenesis, or due to gene mutations), comprising the step of administering an agent to the cell that specifically modulates CRB activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the branching morphogenesis function is restored. The phrase “function is restored”, and equivalents, as used herein, means that the desired phenotype is achieved, or is brought closer to normal compared to untreated cells. For example, with restored branching morphogenesis function, cell proliferation and/or progression through cell cycle may normalize, or be brought closer to normal relative to untreated cells. The invention also provides methods for treating disorders or disease associated with impaired branching morphogenesis function by administering a therapeutically effective amount of a CRB-modulating agent that modulates branching morphogenesis. The invention further provides methods for modulating CRB function in a cell, preferably a cell pre-determined to have defective or impaired CRB function, by administering a CRB-modulating agent. Additionally, the invention provides a method for treating disorders or disease associated with impaired CRB function by administering a therapeutically effective amount of a CRB-modulating agent.

[0121] The discovery that CRB is implicated in branching morphogenesis provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in branching morphogenesis and for the identification of subjects having a predisposition to such diseases and disorders.

[0122] Various expression analysis methods can be used to diagnose whether CRB expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective branching morphogenesis signaling that express a CRB are identified as amenable to treatment with a CRB modulating agent. In a preferred application, the branching morphogenesis defective tissue overexpresses a CRB relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial CRB cDNA sequences as probes, can determine whether particular tumors express or overexpress CRB. Alternatively, TaqMan® is used for quantitative RT-PCR analysis of CRB expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).
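Quantitative RT-PCR data of this kind are often reduced to a relative expression value using the comparative threshold cycle (2^-ΔΔCt) calculation. The specification does not state which analysis is used, so the following Python sketch is offered only as one possible approach, under the assumptions that a reference (housekeeping) gene is measured alongside CRB and that both amplifications are close to 100% efficient. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of target expression in a sample relative to a control.

    Uses the comparative Ct (2^-ddCt) calculation: each target Ct is first
    normalized to the reference gene Ct from the same specimen.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: tumor sample versus matched normal tissue
fold = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # tumor
    ct_target_control=27.0, ct_ref_control=18.0,  # normal
)
print(fold)
```

A fold change above 1 would be consistent with overexpression in the tumor sample relative to the matched normal tissue; the threshold for calling overexpression is not specified here.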

[0123] Various other diagnostic methods may be performed, for example, utilizing reagents such as the CRB oligonucleotides, and antibodies directed against a CRB, as described above for: (1) the detection of the presence of CRB gene mutations, or the detection of either over- or under-expression of CRB mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of CRB gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by CRB.

[0124] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease or disorder in a patient that is associated with alterations in CRB expression, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for CRB expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of the disease or disorder. Preferably, the disease is cancer, most preferably colon, kidney, uterus, prostate, or skin cancer. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0125] The following experimental section and examples are offered by way of illustration and not by way of limitation.

[0126] I. Drosophila Assays

[0127] Genetic screens were designed to identify modifiers of branching morphogenesis in Drosophila. Briefly, Drosophila embryos (approximately stage 16) that were homozygous for lethal insertions of a piggyBac (Fraser M et al., Virology (1985) 145:356-361) or P-element transposon were screened for tracheal defects using monoclonal antibody 2A12 (Samakovlis C, et al., Development (1996) 122:1395-1407; Patel N H. (1994) Practical Uses in Cell and Molecular Biology. Eds L S B Goldstein and E A Fryberg. Vol 44 pp446-488. San Diego Academic Press). Sequence information surrounding the transposon insertion site was used to identify the gene mutated by the insertion. The homozygous disruption of the Drosophila CRUMBS was identified as associated with tracheal defects.

[0128] BLAST analysis (Altschul et al., supra) was employed to identify Targets from Drosophila modifiers. For example, a representative sequence from CRB, GI# 18175295 (SEQ ID NO:15) shares 31% amino acid identity with the Drosophila CRUMBS.
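For readers unfamiliar with how a percent identity figure is derived, the following Python sketch computes it from a pairwise alignment that has already been produced (for example by BLAST). It is an illustration only: the convention of counting identities over aligned, non-gap columns is an assumption, since alignment programs differ in the denominator they report, and the short sequences shown are invented and are not CRB or CRUMBS sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both inputs are rows of the same pairwise alignment and must have
    equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Invented toy alignment, for illustration only
print(percent_identity("MKTALL-AC", "MRTALLGGC"))
```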

[0129] Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2), SMART (Ponting C P, et al., SMART: identification and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. Jan. 1, 1999;27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, p 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, Calif.: AAAI Press, 1998), and clust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. November 2000;10(11):1679-89) programs. For example, SEQ ID NO:17 has EGF-like domains (PFAM PF00008) at approximate amino acid positions 93 to 127, 134 to 165, 172 to 203, 210 to 242, 249 to 280, 287 to 339, 346 to 377, 384 to 415, 422 to 457, 631 to 662, 833 to 864, 1082 to 1113, 1120 to 1151, 1160 to 1192, and 1199 to 1230; and laminin G domains (PFAM PF00054) at approximate amino acid residues 487 to 613, and 922 to 1047. Likewise, SEQ ID NO:15 has EGF-like domains (PFAM PF00008) at approximate amino acid positions 34 to 67, 74 to 107, 114 to 145, 152 to 183, 190 to 221, 228 to 259, 266 to 298, 305 to 336, 343 to 394, 401 to 438, 445 to 480, 676 to 707, 891 to 922, 1143 to 1174, 1181 to 1211, 1218 to 1249, 1259 to 1294, and 1301 to 1332; and laminin G domains (PFAM PF00054) at approximate amino acid residues 514 to 654, 743 to 863, and 980 to 1108.

[0130] TMHMM transmembrane analysis indicates that CRBs are transmembrane proteins with transmembrane domains approximately at amino acid positions 5-22 and 1346-1368 for SEQ ID NO:14, 1247-1269 for SEQ ID NO:17, and 5-22 and 57-79 for SEQ ID NO:16.

[0131] II. Proliferation Assay

[0132] Human umbilical endothelial cells (HMVEC) are maintained at 37° C. in flasks or plates coated with 1.5% porcine skin gelatin (300 bloom, Sigma) in Growth medium (Clonetics Corp.) supplemented with 10-20% fetal bovine serum (FBS, Hyclone). Cells are grown to confluency and used up to the seventh passage. Stimulation medium consists of 50% Sigma 99 media and 50% RPMI 1640 with L-glutamine and additional supplementation with 10 μg/ml insulin-transferrin-selenium (Gibco BRL) and 10% FBS. Cell growth is stimulated by incubation in Stimulation medium supplemented with 20 ng/ml of VEGF. Cell culture assays are carried out in triplicate. Cells are transfected with a mixture of 10 μg of pSV7d expression vectors carrying the CRB coding sequences and 1 μg of pSV2 expression vector carrying the neo resistance gene with the Lipofectin reagent (Life Technologies, Inc.). Stable integrants are selected using 500 μg/ml G418; cloning is carried out by colony isolation using a Pasteur pipette. Transformants are screened by their ability to specifically bind iodinated VEGF. Proliferation assays are performed on growth-arrested cells seeded in 24-well cluster plates. The cell monolayers are incubated in serum-free medium with the modulators and 1 μCi of [³H]thymidine (47 Ci/mmol) for 4 h. The insoluble material is precipitated for 10 min with 10% trichloroacetic acid, neutralized, and dissolved in 0.2 M NaOH, and the radioactivity is counted in a scintillation counter.
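Because the cell culture assays are carried out in triplicate, the scintillation counts from each condition would typically be summarized before conditions are compared. The Python sketch below, offered as an illustration only, reports the mean and sample standard deviation of triplicate counts per minute (CPM) and the fold change of a modulator-treated condition over the no-modulator control. The function names and the CPM values are hypothetical, and no particular statistical test is implied.

```python
from statistics import mean, stdev


def summarize_triplicate(cpm_values):
    """Return (mean, sample standard deviation) of replicate CPM readings."""
    return mean(cpm_values), stdev(cpm_values)


def fold_change(treated_cpm, control_cpm):
    """Mean [3H]thymidine incorporation in treated wells relative to control."""
    return mean(treated_cpm) / mean(control_cpm)


# Hypothetical triplicate scintillation counts (CPM)
control = [1000, 1100, 900]    # VEGF-stimulated, no modulator
treated = [2100, 1900, 2000]   # VEGF-stimulated, with modulator

print(summarize_triplicate(control))
print(fold_change(treated, control))
```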

[0133] III. High-Throughput In Vitro Fluorescence Polarization Assay

[0134] Fluorescently-labeled CRB peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization, determined by using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), relative to control values indicate that the test compound is a candidate modifier of CRB activity.
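Fluorescence polarization is conventionally calculated from the emission intensities measured parallel and perpendicular to the plane of the polarized excitation light, P = (I∥ − G·I⊥) / (I∥ + G·I⊥), and is commonly reported in millipolarization units (mP = 1000 × P). The Python sketch below illustrates that calculation and the comparison of a test well with a control well. The instrument named above reports polarization directly, so this is an explanatory sketch rather than a description of its software; the function name, the default grating factor G of 1.0, and the intensity values are assumptions.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP).

    i_parallel and i_perpendicular are background-corrected emission
    intensities; g_factor is the instrument correction factor (assumed
    to be 1.0 here).
    """
    corrected = g_factor * i_perpendicular
    total = i_parallel + corrected
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected) / total


# Hypothetical intensities for a control well and a test-agent well
control_mp = polarization_mp(1500.0, 1000.0)
test_mp = polarization_mp(1200.0, 1000.0)

print(control_mp)
print(test_mp - control_mp)  # shift in polarization caused by the test agent
```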

[0135] IV. High-Throughput In Vitro Binding Assay

[0136] ³³P-labeled CRB peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25° C. for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and radioactivity is counted in a scintillation counter. Test agents that cause a difference in activity relative to control without test agent are identified as candidate branching morphogenesis modulating agents.

[0137] V. Immunoprecipitations and Immunoblotting

[0138] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the CRB proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM Hepes, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000×g for 15 min. The cell lysate is incubated with 25 μl of M2 beads (Sigma) for 2 h at 4° C. with gentle rocking.

[0139] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

[0140] VI. Expression Analysis

[0141] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, and Ambion.

[0142] TaqMan analysis was used to assess expression levels of the disclosed genes in various samples.

[0143] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following manufacturer's protocols, to a final concentration of 50 ng/μl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif.).

[0144] Primers for expression analysis using the TaqMan assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan protocols and the following criteria: a) primer pairs were designed to span introns to eliminate genomic contamination, and b) each primer pair produced only one product.

[0145] TaqMan reactions were carried out following manufacturer's protocols, in 25 μl total volume for 96-well plates and 10 μl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues, so that a target is likely to be present in appreciable amounts. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
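
One way such a standard-curve analysis with 18S normalization could be carried out is sketched below, for illustration only. The sketch assumes that the standard curve is a least-squares line of threshold cycle (Ct) against log10 of input amount for a dilution series of the universal cDNA pool, and that normalization is performed by dividing the interpolated target quantity by the interpolated 18S quantity. Neither detail, nor any of the function names or numeric values, is specified in the protocol above; all are hypothetical.

```python
import math


def fit_standard_curve(amounts_ng, cts):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in amounts_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the input amount (same units as the dilution series) from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, ref_ct, target_curve, ref_curve):
    """Target quantity divided by reference (e.g., 18S rRNA) quantity."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    ref_qty = quantity_from_ct(ref_ct, *ref_curve)
    return target_qty / ref_qty


if __name__ == "__main__":
    # Hypothetical dilution series of the universal cDNA pool (ng) and measured Ct values.
    dilutions = [100.0, 10.0, 1.0, 0.1]
    target_curve = fit_standard_curve(dilutions, [20.0, 23.32, 26.64, 29.96])
    ref_curve = fit_standard_curve(dilutions, [10.0, 13.32, 16.64, 19.96])
    print(normalized_expression(24.0, 12.0, target_curve, ref_curve))
```

Fitting a separate curve for the target and for 18S rRNA allows for different amplification efficiencies of the two assays; whether this was done in the analysis described above is not stated.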

[0146] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2-fold or higher in the tumor compared with its matched normal sample. In cases where normal tissue was not available, a universal pool of cDNA samples was used instead. In these cases, a gene was considered overexpressed in a tumor sample when the difference of expression levels between a tumor sample and the average of all normal samples from the same tissue type was greater than 2 times the standard deviation of all normal samples (i.e., Tumor−average(all normal samples)>2×STDEV(all normal samples)).
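
The two overexpression criteria can be written out as a short sketch, given here for illustration only. The function names are hypothetical, the expression values are assumed to be already normalized, and STDEV is assumed to denote the sample standard deviation (n−1 denominator); the text does not state which form was used.

```python
from statistics import mean, stdev


def overexpressed_matched(tumor, matched_normal, fold=2.0):
    """Matched-pair rule: tumor expression is at least `fold` times its matched normal."""
    return tumor >= fold * matched_normal


def overexpressed_unmatched(tumor, normals, k=2.0):
    """Unmatched rule: Tumor - average(normals) > k * STDEV(normals).

    `normals` holds all normal samples of the same tissue type (at least two values).
    """
    return (tumor - mean(normals)) > k * stdev(normals)


def fraction_overexpressed(pairs, fold=2.0):
    """Fraction of (tumor, matched_normal) pairs that meet the matched-pair rule."""
    hits = sum(1 for tumor, normal in pairs if overexpressed_matched(tumor, normal, fold))
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical normalized expression values.
    pairs = [(4.0, 2.0), (1.0, 1.0), (3.0, 1.0), (1.0, 2.0)]
    print(fraction_overexpressed(pairs))
    print(overexpressed_unmatched(1.5, [1.0, 1.2, 0.8, 1.0]))
```

The unmatched rule scales the cutoff by the variability of the normal samples, so a tissue type with widely varying normal expression requires a larger tumor excess before a gene is called overexpressed.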

[0147] For matched tumor and normal tissue samples, CRB1 (SEQ ID NO:3) was overexpressed in 27% of colon cancers (33 pairs), 25% of kidney cancers (24 pairs), 26% of uterine cancers (19 pairs), 25% of prostate cancers (12 pairs), and 33% of skin cancers (3 pairs). A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

[0148] VII. The Crumbs Protein Family and their Potential Function in the β-catenin Mediated Regulation of Cell Polarity in Colorectal Tumors

[0149] Since loss of polarity is thought to be a key driver in the formation of colon tumors upon APC loss of heterozygosity, we investigated whether Crumbs might be contributing to the polarity defect. In the Dmel-2 Drosophila cell line we have shown using DNA arrays and real-time PCR (TaqMan™) that the Crumbs gene is transcriptionally upregulated upon activation of the β-catenin pathway through axin or GSK3-beta loss of function. In this system, Crumbs up-regulation is dependent on the integrity of the TCF transcriptional activation complex, meaning that transcriptional activation of β-catenin is related to the Wnt signaling pathway. Moreover, we found that overexpression of Crumbs in the Drosophila eye causes loss of polarity in the photoreceptors and results in a phenotype that is identical to the phenotype of APC loss of function (Ahmed Y et al. (1998) Cell 93:1171-1182). Interestingly, the protein CRB3 (SEQ ID NO:16) is expressed in a colorectal cancer cell line (Caco2) and can bind to the PDZ domain of a putative human ortholog of Discs lost (Lemmers C et al. (2002) Journal of Biological Chemistry 277:25408-25415). Taken together, these data suggest that the loss of polarity observed upon activation of the β-catenin pathway in colorectal cancer cells might be in part due to the transcriptional induction of Crumbs proteins.

[0150] The relationship of β-catenin to branching morphogenesis, including vasculogenesis and angiogenesis, is well-established (Venkiteswaran K et al (2002) Am J Physiol Cell Physiol. 283:C811-21; Petzelbauer P et al (2000) J Investig Dermatol Symp Proc 5:10-13; Yano H et al (2000) Neurol Res. 22:650-6; Yano H et al (2000) Neurol Res. 22:527-32; Carmeliet P et al (1999) Cell. 98:147-57), and we have found that the Crumbs protein is required for branching morphogenesis in flies. We propose here that the Crumbs protein family might also constitute attractive therapeutic antibody targets for the colorectal cancer associated with APC mutations. We are currently testing whether the human Crumbs proteins are overexpressed in human APC tumor samples.

1 17 1 4361 DNA Homo sapiens 1 tgtaagtagg gtgggacaga gatggcacct gggggttctg aggcacccgc tcctctctga 60 gacagacagg gatcaggagc cggactggga ccagaccacc agcaacacac cagaggatgt 120 tctctaaata agaccatggc acttaagaac attaactacc ttctcatctt ctacctcagt 180 ttctcactgc ttatctacat aaaaaattcc ttttgcaata aaaacaacac caggtgcctc 240 tcaaattctt gccaaaacaa ttctacatgc aaagattttt caaaagacaa tgattgttct 300 tgttcagaca cagccaataa tttggacaaa gactgtgaca acatgaaaga cccttgcttc 360 tccaatccct gtcaaggaag tgccacttgt gtgaacaccc caggagaaag gagctttctg 420 tgcaaatgtc ctcctgggta cagtgggaca atctgtgaaa ctaccattgg ttcctgtggc 480 aagaactcct gccaacatgg aggtatttgc catcaggacc ctatttatcc tgtctgcatc 540 tgccctgctg gatatgctgg aagattctgt gagatagatc acgatgagtg tgcttccagc 600 ccttgccaaa atggggccgt gtgccaggat ggaattgatg gttactcctg cttctgtgtc 660 ccaggatatc aaggcagaca ctgcgacttg gaagtggatg aatgtgcttc agatccctgc 720 aagaacgagg ctacatgcct caatgaaata ggaagatata cttgtatctg tccccacaat 780 tattctggtg taaactgtga attggaaatt gacgaatgtt ggtcccagcc ttgtttaaat 840 ggtgcaactt gtcaggatgc tctgggggcc tatttctgcg actgtgcccc tggattcctg 900 ggggatcact gtgaactcaa cactgatgag tgtgccagtc aaccttgtct ccatggaggg 960 ctgtgtgtgg atggagaaaa cagatatagc tgtaactgca cgggtagtgg attcacaggg 1020 acacactgtg agaccttgat gcctctttgt tggtcaaaac cttgtcacaa taatgctaca 1080 tgtgaggaca gtgttgacaa ttacacttgt cactgctggc ctggatacac aggtgcccag 1140 tgtgagatcg acctcaatga atgcaatagt aacccctgcc agtccaatgg ggaatgtgtg 1200 gagctgtcct cagagaaaca atatggacgc atcactggac tgccttcttc tttcagctac 1260 catgaagcct caggttatgt ctgtatctgt cagcctggat tcacaggaat ccactgcgaa 1320 gaagacgtca atgaatgttc ttcaaaccct tgccaaaatg gtggtacttg tgagaacttg 1380 cctgggaatt atacttgcca ttgcccattt gataaccttt ctagaacttt ttatggagga 1440 agggactgtt ctgatattct cctgggctgt acccatcagc aatgtctaaa taatggaaca 1500 tgcatccctc acttccaaga tggccagcat ggattcagct gcctgtgtcc atctggctac 1560 accgggtccc tgtgtgaaat cgcaaccaca ctttcatttg agggcgatgg cttcctgtgg 1620 gtcaaaagtg gctcagtgac aaccaagggc tcagtttgta acatagccct caggtttcag 
1680 actgttcagc caatggctct tctacttttc cgaagcaaca gggatgtgtt tgtgaagctg 1740 gagctgctaa gtggctacat tcacttatca attcaggtca ataatcagtc aaaggtgctt 1800 ctgttcattt cccacaacac cagcgatgga gagtggcatt tcgtggaggt aatatttgca 1860 gaggctgtga cccttacctt aatcgacgac tcctgtaagg agaaatgcat cgcgaaagct 1920 cctactccac ttgaaagtga tcaatcaata tgtgcttttc agaactcctt tttgggtggt 1980 ttaccagtgg gaatgaccag caatggtgtt gctctgctta acttctataa tatgccatcc 2040 acaccttcgt ttgtaggctg tctccaagac attaaaattg attggaatca cattaccctg 2100 gagaacatct cgtctggctc atcattaaat gtcaaggcag gctgtgtgag aaaggattgg 2160 tgtgaaagcc aaccttgtca aagcagagga cgctgcatca acttgtggct gagttaccag 2220 tgtgactgcc acaggcccta tgaaggcccc aactgtctga gagagtatgt ggcaggcaga 2280 tttggccagg atgactccac tggttatgtc atctttactc ttgatgagag ctatggagac 2340 accatcagcc tctccatgtt tgtccgaacg cttcaaccat caggcttact tctagctttg 2400 gaaaacagca cttatcaata tatccgtgtc tggctagagc gcggcagact agcaatgctg 2460 actccaaact ctcccaaatt agtagtaaaa tttgttctta atgatggaaa tgtccacttg 2520 atatctttga aaatcaagcc atataaaatt gaactgtatc agtcttcaca aaacctagga 2580 tttatttctg cttctacgtg gaaaatcgaa aagggagatg tcatctacat tggtggccta 2640 cctgacaagc aagagactga acttaatggt ggattcttca aaggctgtat ccaagatgta 2700 agactaaaca accaaaatct ggaattcttt ccaaatccaa caaacaatgc atctctcaat 2760 ccagttcttg tcaatgtaac ccaaggctgt gctggagaca acagctgcaa gtccaacccc 2820 tgtcacaatg gaggtgtttg ccattcccgg tgggatgact tctcctgttc ctgtcctgcc 2880 ctcacaagtg ggaaagcctg tgaggaggtt cagtggtgtg gattcagccc gtgtcctcac 2940 ggagcccagt gccagccggt gcttcaagga tttgaatgta ttgcaaatgc tgtttttaat 3000 ggacaaagcg gtcaaatatt attcagaagc aatgggaata ttaccagaga actcaccaat 3060 atcacatttg gtttcagaac aagggatgca aatgtaataa tattgcatgc agaaaaagag 3120 cctgaatttc ttaatattag cattcaagat tccagattat tctttcaatt gcaaagtggc 3180 aacagctttt atatgctaag tctgacaagt ttgcagtcag tgaatgatgg cacatggcac 3240 gaagtgaccc tttccatgac agacccactg tcccagacct ccaggtggca aatggaagtg 3300 gacaacgaaa caccttttgt gaccagcaca attgctactg gaagcctcaa ctttttgaag 3360 
gataatacag atatttatgt gggagacaga gctattgaca atataaaggg cctgcaaggg 3420 tgtctaagta caatagaaat cggaggcatt tatctctctt actttgaaaa tgttcatggt 3480 ttcattaata aacctcagga agagcaattt ctcaaaatct ctaccaattc agtggtcact 3540 ggctgtttgc agttaaatgt ctgcaactcc aacccctgtt tgcatggagg aaactgtgaa 3600 gacatctata gctcttatca ttgctcctgt cccttgggat ggtcagggaa acactgtgaa 3660 ctcaacatcg atgaatgctt ttcaaacccc tgtatccatg gcaactgctc tgacagagtt 3720 gcagcctacc actgcacatg tgagcctgga tacactggtg tgaactgtga agtggatata 3780 gacaactgcc agagtcacca gtgtgcaaat ggagccacct gcattagtca tactaatggc 3840 tattcttgcc tctgttttgg aaattttaca ggaaaatttt gcagacagag cagattaccc 3900 tcaacagtct gtgggaatga gaagacaaat ctcacttgct acaatggagg caactgcaca 3960 gagttccaga ctgaattaaa atgtatgtgc cggccaggtt ttactggaga atggtgtgaa 4020 aaggacattg atgagtgtgc ctctgatccg tgtgtcaatg gaggtctgtg ccaggactta 4080 ctcaacaaat tccagtgcct ctgtgatgtt gcctttgctg gcgagcgctg cgaggtggac 4140 gtaagcagcc tctcctttta tgtctctctc ttattctggc agaatctttt tcagcttctt 4200 tcttacctca ttttgcgtat gaatgacgag ccagttgttg agtggggtga acaggaagat 4260 tattaacata catttgaaca ttcccaaatg aaaaaaaaag ccattgaatt tcaagaaatg 4320 ccttgattca ttttagatct ctggggaaaa aaaaaaaaaa a 4361 2 4950 DNA Homo sapiens 2 tgtaagtagg gtgggacaga gatggcacct gggggttctg aggcacccgc tcctctctga 60 gacagacagg gatcaggagc cggactggga ccagaccacc agcaacacac cagaggatgt 120 tctctaaata agaccatggc acttaagaac attaactacc ttctcatctt ctacctcagt 180 ttctcactgc ttatctacat aaaaaattcc ttttgcaata aaaacaacac caggtgcctc 240 tcaaattctt gccaaaacaa ttctacatgc aaagattttt caaaagacaa tgattgttct 300 tgttcagaca cagccaataa tttggacaaa gactgtgaca acatgaaaga cccttgcttc 360 tccaatccct gtcaaggaag tgccacttgt gtgaacaccc caggagaaag gagctttctg 420 tgcaaatgtc ctcctgggta cagtgggaca atctgtgaaa ctaccattgg ttcctgtggc 480 aagaactcct gccaacatgg aggtatttgc catcaggacc ctatttatcc tgtctgcatc 540 tgccctgctg gatatgctgg aagattctgt gagatagatc acgatgagtg tgcttccagc 600 ccttgccaaa atggggccgt gtgccaggat ggaattgatg gttactcctg cttctgtgtc 660 ccaggatatc 
aaggcagaca ctgcgacttg gaagtggatg aatgtgcttc agatccctgc 720 aagaacgagg ctacatgcct caatgaaata ggaagatata cttgtatctg tccccacaat 780 tattctggtg taaactgtga attggaaatt gacgaatgtt ggtcccagcc ttgtttaaat 840 ggtgcaactt gtcaggatgc tctgggggcc tatttctgcg actgtgcccc tggattcctg 900 ggggatcact gtgaactcaa cactgatgag tgtgccagtc aaccttgtct ccatggaggg 960 ctgtgtgtgg atggagaaaa cagatatagc tgtaactgca cgggtagtgg attcacaggg 1020 acacactgtg agaccttgat gcctctttgt tggtcaaaac cttgtcacaa taatgctaca 1080 tgtgaggaca gtgttgacaa ttacacttgt cactgctggc ctggatacac aggtgcccag 1140 tgtgagatcg acctcaatga atgcaatagt aacccctgcc agtccaatgg ggaatgtgtg 1200 gagctgtcct cagagaaaca atatggacgc atcactggac tgccttcttc tttcagctac 1260 catgaagcct caggttatgt ctgtatctgt cagcctggat tcacaggaat ccactgcgaa 1320 gaagacgtca atgaatgttc ttcaaaccct tgccaaaatg gtggtacttg tgagaacttg 1380 cctgggaatt atacttgcca ttgcccattt gataaccttt ctagaacttt ttatggagga 1440 agggactgtt ctgatattct cctgggctgt acccatcagc aatgtctaaa taatggaaca 1500 tgcatccctc acttccaaga tggccagcat ggattcagct gcctgtgtcc atctggctac 1560 accgggtccc tgtgtgaaat cgcaaccaca ctttcatttg agggcgatgg cttcctgtgg 1620 gtcaaaagtg gctcagtgac aaccaagggc tcagtttgta acatagccct caggtttcag 1680 actgttcagc caatggctct tctacttttc cgaagcaaca gggatgtgtt tgtgaagctg 1740 gagctgctaa gtggctacat tcacttatca attcaggtca ataatcagtc aaaggtgctt 1800 ctgttcattt cccacaacac cagcgatgga gagtggcatt tcgtggaggt aatatttgca 1860 gaggctgtga cccttacctt aatcgacgac tcctgtaagg agaaatgcat cgcgaaagct 1920 cctactccac ttgaaagtga tcaatcaata tgtgcttttc agaactcctt tttgggtggt 1980 ttaccagtgg gaatgaccag caatggtgtt gctctgctta acttctataa tatgccatcc 2040 acaccttcgt ttgtaggctg tctccaagac attaaaattg attggaatca cattaccctg 2100 gagaacatct cgtctggctc atcattaaat gtcaaggcag gctgtgtgag aaaggattgg 2160 tgtgaaagcc aaccttgtca aagcagagga cgctgcatca acttgtggct gagttaccag 2220 tgtgactgcc acaggcccta tgaaggcccc aactgtctga gagagtatgt ggcaggcaga 2280 tttggccagg atgactccac tggttatgtc atctttactc ttgatgagag ctatggagac 2340 accatcagcc tctccatgtt 
tgtccgaacg cttcaaccat caggcttact tctagctttg 2400 gaaaacagca cttatcaata tatccgtgtc tggctagagc gcggcagact agcaatgctg 2460 actccaaact ctcccaaatt agtagtaaaa tttgttctta atgatggaaa tgtccacttg 2520 atatctttga aaatcaagcc atataaaatt gaactgtatc agtcttcaca aaacctagga 2580 tttatttctg cttctacgtg gaaaatcgaa aagggagatg tcatctacat tggtggccta 2640 cctgacaagc aagagactga acttaatggt ggattcttca aaggctgtat ccaagatgta 2700 agactaaaca accaaaatct ggaattcttt ccaaatccaa caaacaatgc atctctcaat 2760 ccagttcttg tcaatgtaac ccaaggctgt gctggagaca acagctgcaa gtccaacccc 2820 tgtcacaatg gaggtgtttg ccattcccgg tgggatgact tctcctgttc ctgtcctgcc 2880 ctcacaagtg ggaaagcctg tgaggaggtt cagtggtgtg gattcagccc gtgtcctcac 2940 ggagcccagt gccagccggt gcttcaagga tttgaatgta ttgcaaatgc tgtttttaat 3000 ggacaaagcg gtcaaatatt attcagaagc aatgggaata ttaccagaga actcaccaat 3060 atcacatttg gtttcagaac aagggatgca aatgtaataa tattgcatgc agaaaaagag 3120 cctgaatttc ttaatattag cattcaagat tccagattat tctttcaatt gcaaagtggc 3180 aacagctttt atatgctaag tctgacaagt ttgcagtcag tgaatgatgg cacatggcac 3240 gaagtgaccc tttccatgac agacccactg tcccagacct ccaggtggca aatggaagtg 3300 gacaacgaaa caccttttgt gaccagcaca attgctactg gaagcctcaa ctttttgaag 3360 gataatacag atatttatgt gggagacaga gctattgaca atataaaggg cctgcaaggg 3420 tgtctaagta caatagaaat cggaggcatt tatctctctt actttgaaaa tgttcatggt 3480 ttcattaata aacctcagga agagcaattt ctcaaaatct ctaccaattc agtggtcact 3540 ggctgtttgc agttaaatgt ctgcaactcc aacccctgtt tgcatggagg aaactgtgaa 3600 gacatctata gctcttatca ttgctcctgt cccttgggat ggtcagggaa acactgtgaa 3660 ctcaacatcg atgaatgctt ttcaaacccc tgtatccatg gcaactgctc tgacagagtt 3720 gcagcctacc actgcacatg tgagcctgga tacactggtg tgaactgtga agtggatata 3780 gacaactgcc agagtcacca gtgtgcaaat ggagccacct gcattagtca tactaatggc 3840 tattcttgcc tctgttttgg aaattttaca ggaaaatttt gcagacagag cagattaccc 3900 tcaacagtct gtgggaatga gaagacaaat ctcacttgct acaatggagg caactgcaca 3960 gagttccaga ctgaattaaa atgtatgtgc cggccaggtt ttactggaga atggtgtgaa 4020 aaggacattg atgagtgtgc ctctgatccg 
tgtgtcaatg gaggtctgtg ccaggactta 4080 ctcaacaaat tccagtgcct ctgtgatgtt gcctttgctg gcgagcgctg cgaggtggac 4140 ttggcagatg acttgatctc cgacattttc accactattg gctcagtgac tgtcgccttg 4200 ttactgatcc tcttgctggc cattgttgct tctgttgtca cctccaacaa aagggcaact 4260 cagggaacct acagccccag ccgtcaggag aaggagggct cccgagtgga aatgtggaac 4320 ttgatgccac cccctgcaat ggagagactg atttaggagc attgtgtccc ttcgagatgg 4380 ggatccacac actgtgaatg tgatgactgt acttcaggta tctctgacat acctgacaat 4440 gttaatctgc aactgggatt acactggaac tacaggaatg attcctttga ccaccttaaa 4500 aactttcaca gtggttccgc tcgacaccat tgttttatta tattatatca gccaattgca 4560 aaaaaagtct gtgccagtaa tttcagcctt ataattagca aaaacatctt ccagagaata 4620 aagtcttctg tggctttagt ggctatcact gaaactcttt cctcttttca acctgggaac 4680 aaattttagt tttcatttta ggtttctgta ctttctgtag tttctgtgta aactgccata 4740 tgtttacatg gaaactacag gaaaaaattg gctacatttc tcacttctcc tatcatgtgg 4800 tcaaagttat tgttgtatac cagcgatggg atgtatactt ttgtccttca ttcatggatt 4860 cagagaaagc tctgggaatg acttatggtc caaaaaagtg acccaatggc aacaaataaa 4920 aattgaaatg caaaaaaaaa aaaaaaaaaa 4950 3 4146 DNA Homo sapiens 3 aaataagacc atggcactta agaacattaa ctaccttctc atcttctacc tcagtttctc 60 actgcttatc tacataaaaa attccttttg caataaaaac aacaccaggt gcctctcaaa 120 ttcttgccaa aacaattcta catgcaaaga tttttcaaaa gacaatgatt gttcttgttc 180 agacacagcc aataatttgg acaaagactg tgacaacatg aaagaccctt gcttctccaa 240 tccctgtcaa ggaagtgcca cttgtgtgaa caccccagga gaaaggagct ttctgtgcaa 300 atgtcctcct gggtacagtg ggacaatctg tgaaactacc attggttcct gtggcaagaa 360 ctcctgccaa catggaggta tttgccatca ggaccctatt tatcctgtct gcatctgccc 420 tgctggatat gctggaagat tctgtgagat agatcacgat gagtgtgctt ccagcccttg 480 ccaaaatggg gccgtgtgcc aggatggaat tgatggttac tcctgcttct gtgtcccagg 540 atatcaaggc agacactgcg acttggaagt ggatgaatgt gcttcagatc cctgcaagaa 600 cgaggctaca tgcctcaatg aaataggaag atatacttgt atctgtcccc acaattattc 660 tggtgtaaac tgtgaattgg aaattgacga atgttggtcc cagccttgtt taaatggtgc 720 aacttgtcag gatgctctgg gggcctattt ctgcgactgt gcccctggat 
tcctggggga 780 tcactgtgaa ctcaacactg atgagtgtgc cagtcaacct tgtctccatg gagggctgtg 840 tgtggatgga gaaaacagat atagctgtaa ctgcacgggt agtggattca cagggacaca 900 ctgtgagacc ttgatgcctc tttgttggtc aaaaccttgt cacaataatg ctacatgtga 960 ggacagtgtt gacaattaca cttgtcactg ctggcctgga tacacaggtg cccagtgtga 1020 gatcgacctc aatgaatgca atagtaaccc ctgccagtcc aatggggaat gtgtggagct 1080 gtcctcagag aaacaatatg gacgcatcac tggactgcct tcttctttca gctaccatga 1140 agcctcaggt tatgtctgta tctgtcagcc tggattcaca ggaatccact gcgaagaaga 1200 cgtcaatgaa tgttcttcaa acccttgcca aaatggtggt acttgtgaga acttgcctgg 1260 gaattatact tgccattgcc catttgataa cctttctaga actttttatg gaggaaggga 1320 ctgttctgat attctcctgg gctgtaccca tcagcaatgt ctaaataatg gaacatgcat 1380 ccctcacttc caagatggcc agcatggatt cagctgcctg tgtccatctg gctacaccgg 1440 gtccctgtgt gaaatcgcaa ccacactttc atttgagggc gatggcttcc tgtgggtcaa 1500 aagtggctca gtgacaacca agggctcagt ttgtaacata gccctcaggt ttcagactgt 1560 tcagccaatg gctcttctac ttttccgaag caacagggat gtgtttgtga agctggagct 1620 gctaagtggc tacattcact tatcaattca ggtcaataat cagtcaaagg tgcttctgtt 1680 catttcccac aacaccagcg atggagagtg gcatttcgtg gaggtaatat ttgcagaggc 1740 tgtgaccctt accttaatcg acgactcctg taaggagaaa tgcatcgcga aagctcctac 1800 tccacttgaa agtgatcaat caatatgtgc ttttcagaac tcctttttgg gtggtttacc 1860 agtgggaatg accagcaatg gtgttgctct gcttaacttc tataatatgc catccacacc 1920 ttcgtttgta ggctgtctcc aagacattaa aattgattgg aatcacatta ccctggagaa 1980 catctcgtct ggctcatcat taaatgtcaa ggcaggctgt gtgagaaagg attggtgtga 2040 aagccaacct tgtcaaagca gaggacgctg catcaacttg tggctgagtt accagtgtga 2100 ctgccacagg ccctatgaag gccccaactg tctgagagag tatgtggcag gcagatttgg 2160 ccaggatgac tccactggtt atgtcatctt tactcttgat gagagctatg gagacaccat 2220 cagcctctcc atgtttgtcc gaacgcttca accatcaggc ttacttctag ctttggaaaa 2280 cagcacttat caatatatcc gtgtctggct agagcgcggc agactagcaa tgctgactcc 2340 aaactctccc aaattagtag taaaatttgt tcttaatgat ggaaatgtcc acttgatatc 2400 tttgaaaatc aagccatata aaattgaact gtatcagtct tcacaaaacc taggatttat 2460 
ttctgcttct acgtggaaaa tcgaaaaggg agatgtcatc tacattggtg gcctacctga 2520 caagcaagag actgaactta atggtggatt cttcaaaggc tgtatccaag atgtaagact 2580 aaacaaccaa aatctggaat tctttccaaa tccaacaaac aatgcatctc tcaatccagt 2640 tcttgtcaat gtaacccaag gctgtgctgg agacaacagc tgcaagtcca acccctgtca 2700 caatggaggt gtttgccatt cccggtggga tgacttctcc tgttcctgtc ctgccctcac 2760 aagtgggaaa gcctgtgagg aggttcagtg gtgtggattc agcccgtgtc ctcacggagc 2820 ccagtgccag ccggtgcttc aaggatttga atgtattgca aatgctgttt ttaatggaca 2880 aagcggtcaa atattattca gaagcaatgg gaatattacc agagaactca ccaatatcac 2940 atttggtttc agaacaaggg atgcaaatgt aataatattg catgcagaaa aagagcctga 3000 atttcttaat attagcattc aagattccag attattcttt caattgcaaa gtggcaacag 3060 cttttatatg ctaagtctga caagtttgca gtcagtgaat gatggcacat ggcacgaagt 3120 gaccctttcc atgacagacc cactgtccca gacctccagg tggcaaatgg aagtggacaa 3180 cgaaacacct tttgtgacca gcacaattgc tactggaagc ctcaactttt tgaaggataa 3240 tacagatatt tatgtgggag acagagctat tgacaatata aagggcctgc aagggtgtct 3300 aagtacaata gaaatcggag gcatttatct ctcttacttt gaaaatgttc atggtttcat 3360 taataaacct caggaagagc aatttctcaa aatctctacc aattcagtgg tcactggctg 3420 tttgcagtta aatgtctgca actccaaccc ctgtttgcat ggaggaaact gtgaagacat 3480 ctatagctct tatcattgct cctgtccctt gggatggtca gggaaacact gtgaactcaa 3540 catcgatgaa tgcttttcaa acccctgtat ccatggcaac tgctctgaca gagttgcagc 3600 ctaccactgc acatgtgagc ctggatacac tggtgtgaac tgtgaagtgg atatagacaa 3660 ctgccagagt caccagtgtg caaatggagc cacctgcatt agtcatacta atggctattc 3720 ttgcctctgt tttggaaatt ttacaggaaa attttgcaga cagagcagat taccctcaac 3780 agtctgtggg aatgagaaga caaatctcac ttgctacaat ggaggcaact gcacagagtt 3840 ccagactgaa ttaaaatgta tgtgccggcc aggttttact ggagaatggt gtgaaaagga 3900 cattgatgag tgtgcctctg atccgtgtgt caatggaggt ctgtgccagg acttactcaa 3960 caaattccag tgcctctgtg atgttgcctt tgctggcgag cgctgcgagg tggacgtaag 4020 cagcctctcc ttttatgtct ctctcttatt ctggcagaat ctttttcagc ttctttctta 4080 cctcattttg cgtatgaatg acgagccagt tgttgagtgg ggtgaacagg aagattatta 4140 acatac 4146 
4 1089 DNA Homo sapiens 4 accgacggac cgagggttcg agggagggac acggaccagg aacctgagct aggtcaaaga 60 cgcccgggcc aggtgccccg tcgcaggtgc ccctggccgg agatgcggta ggaggggcga 120 gcgcgagaag ccccttcctc ggcgctgcca acccgccacc cagcccatgg cgaaccccgg 180 gctggggctg cttctggcgc tgggcctgcc gttcctgctg gcccgctggg gccgagcctg 240 ggggcaaata cagaccactt ctgcaaatga gaatagcact gttttgcctt catccaccag 300 ctccagctcc gatggcaacc tgcgtccaga agccatcact gctatcatcg tggtcttctc 360 cctcttggct gccttgctcc tggctgtggg gctggcactg ttggtgcgga agcttcggga 420 gaagcggcag acggagggca cctaccggcc cagtagcgag gagcaggtgg gtgcccgcgt 480 gccaccgacc cccaacctca agttgccgcc ggaagagcgg ctcatctgaa cgctggggcc 540 tgctgcagcc accaacactg cccaggactg cgggttgctg gcttgtacac cgcagctgcc 600 accgagacac cagcctctga tggctcagga ggacttgtgg ggagaggctg ggggcaccca 660 tgtggtgggc tctgtgcagc atgttgcctc tgcttggctg tgcctgcagc tcagggtgct 720 ggggctcggg acccaccccc ctgcttgcgg aaccaacttt tctctgtgtg tccagcaggc 780 cccacaaccc cctctccttt ctttcagttc tcccatgcag ccgaggcccg ggcccctcag 840 gactccaagg agacggtgca gggctgcctg cccatctagg tcccctctcc tgcatctgtc 900 tcccttcatt gctgtgtgac cttggggaaa ggcagtgccc tctctgggca gtcagatcca 960 cccagtgctt aatagcaggg aagaaggtac ttcaaagact ctgcccctga ggtcaagaga 1020 ggatggggct attcactttt atatatttat ataaaattag tagtgagatg taacaaaaaa 1080 aaaaaaaaa 1089 5 741 DNA Homo sapiens 5 accgagggtt cgagggaggg acacggacca ggaacctgag ctaggtcaaa gacgcccggg 60 ccaggtgccc cgtcgcaggt gcccctggcc ggagatgcgg taggaggggc gagcgcgaga 120 agccccttcc tcggcgctgc caacccgcca cccagcccat ggcgaacccc gggctggggc 180 tgcttctggc gctgggcctg ccgttcctgc tggcccgctg gggccgagcc tgggggcaaa 240 tacagaccac ttctgcaaat gagaatagca ctgttttgcc ttcatccacc agctccagct 300 ccgatggcaa cctgcgtccg gaagccatca ctgctatcat cgtggtcttc tccctcttgg 360 ctgccttgct cctggctgtg gggctggcac tgttggtgcg gaagcttcgg gagaagcggc 420 agacggaggg cacctaccgg cccagtagcg aggagcagtt ctcccatgca gccgaggccc 480 gggcccctca ggactccaag gagacggtgc agggctgcct gcccatctag gtcccctctc 540 ctgcatctgt ctcccttcat tgctgtgtga ccttggggaa 
aggcagtgcc ctctctgggc 600 agtcagatcc acccagtgct taatagcagg gaagaaggta cttcaaagac tctgcccctg 660 aggtcaagag aggatggggc tattcacttt tatatattta tataaaatta gtagtgagat 720 gtaaaaaaaa aaaaaaaaaa a 741 6 516 DNA Homo sapiens 6 cagatatgct cgcaatgcag gtgtaaggtc cccctcacac ctgcgcgctt ccgcggtctc 60 cctccccgca tccccattaa gggactgggg tcccgttaca gcgaggctca ggtgcacaag 120 ccggaagtgc gctctcccag gtgccccgtc gcaggtgccc ctggccggag atgcggtagg 180 aggggcgagc gcgagaagcc ccttcctcgg cgctgccaac ccgccaccca gcccatggcg 240 aaccccgggc tggggctgct tctggcgctg ggcctgccgt tcctgctggc ccgctggggc 300 cgagcctggg ggcaacgtcc agaagccatc actgctatca tcgtggtctt ctccctcttg 360 gctgccttgc tcctggctgt ggggctggca ctgttggtgc ggaagcttcg ggagaagcgg 420 cagacggagg gcacctaccg gcccagtagc gaggaggtgg gtgcccgcgt gccaccgacc 480 cccaacctca agttgccgcc ggaagagcgg ctcatc 516 7 3921 DNA Homo sapiens 7 atggcgctgg ccaggcctgg gaccccggac ccccaggccc tggcctctgt cctgctactg 60 ctgctctggg cccctgccct ttccctcctg gctggaggta actccctgga actgtgctct 120 gagcccaaac tctcaagggt tggtcagtgc caggcacagg ggacggtgcc ttcagagccc 180 cccagtgcct gtgcctcaga cccgtgcgct ccagggaccg agtgccaggc taccgagagt 240 ggtggctata cctgtgggcc catggagccc cggggctgtg ccacccagcc atgccaccac 300 ggcgctctgt gtgtgcccca gggtccagat cccaccggct tccgctgcta ctgcgtgccg 360 ggtttccagg gcccacgctg cgagctggac atcgatgagt gtgcatcccg gccgtgccac 420 catggggcca cctgccgcaa cctggccgat cgctacgagt gccattgccc ccttggctat 480 gcaggcgtga cctgcgagat ggaggtggac gagtgcgcct cagcgccctg cctgcacggg 540 ggctcgtgcc tggacggcgt gggctccttc cgctgtgtgt gcgcgccagg ctacgggggc 600 acccgttgcc agctggacct cgacgagtgc cagagccagc cgtgcgcaca tgggggcacg 660 tgccacgacc tggtcaacgg gttccggtgc gactgcgcgg gcaccggcta cgagggcacg 720 cactgcgagc gggaggtgct ggagtgcgca tcggcgccct gcgagcacaa cgcgtcctgc 780 ctcgagggcc tcgggagctt ccgctgcctc tgttggccag gctacagcgg cgagctgtgc 840 gaggtggacg aggacgagtg tgcatcgagc ccctgccagc atgggggccg atgcctgcag 900 cgctctgacc cggccctcta cgggggtgtc caggccgcct tccctggcgc cttcagcttc 960 cgccatgctg cgggtttcct gtgccactgc 
cctcctggct ttgagggagc cgactgcggt 1020 gtggaggtgg acgagtgtgc ctcacggcca tgcctcaacg gaggccactg ccaggacctg 1080 cccaatggct tccagtgtca ctgcccagat ggctacgcag ggccgacatg tgaggaagat 1140 gtggatgaat gcctgtcgga tccctgcctg cacggcggaa cctgcagtga cactgtggca 1200 ggctatatct gcaggtgccc agagacctgg ggtgggcgcg actgttctgt gcagctcact 1260 ggctgccagg gccacacctg cccgctggct gccacctgca tccctatctt cgagtctggg 1320 gtccacagtt acgtctgcca ctgcccacct ggtacccatg gaccgttctg tggccagaat 1380 accaccttct ctgtgatggc tgggagcccc attcaggcat cagtgccagc tggtggcccc 1440 ctgggtctgg cactgaggtt tcgcaccaca ctgcccgctg ggaccttggc cactcgcaat 1500 gacaccaagg aaagcttgga gctggcattg gtggcagcca cacttcaggc cacactctgg 1560 agctacagca ccactgtgct tgtcctgaga ctgccggacc tggccctaaa cgatggccat 1620 tggcaccagg tggaggttgt gctccatcta gcgaccctgg agctacggct ctggcatgag 1680 ggctgccctg cccggctctg tgtggcctct ggtcctgtgg ccctggcttc cacggcttcg 1740 gcaactccgc tgcctgccgg gatctcctct gcccagctgg gggacgcgac ctttgcaggc 1800 tgcctccagg acgtgcgtgt ggatggccac ctcctgctgc ctgaggatct cggtgagaac 1860 gtcctcctgg gctgtgagcg ccgagagcag tgccggcctc tgccttgtgt ccacggaggg 1920 tcctgtgtgg atctgtggac tcatttccgt tgcgactgtg cccggcccca tagaggtccc 1980 acgtgcgctg atgagattcc tgctgccacc tttggcttgg gaggcgcccc aagctctgcc 2040 tcctttctgc tccaagagct gccaggtccc aacctcacag tgtctttcct tctccgcact 2100 cgggagtccg ctggcctgtt gctccagttt gccaatgact ccgcagctgg cctaacagta 2160 ttcctgagtg agggtcggat ccgggctgag gtgccgggca gtcctgctgt agtgctccct 2220 gggcgctggg atgatgggct ccgtcacctg gtgatgctca gcttcgggcc tgaccagctg 2280 caggacctgg ggcagcacgt gcacgtgggt gggaggctcc ttgctgccga cagccagccc 2340 tggggtgggc ccttccgagg ctgcctccag gacctgcgac tcgatggctg ccacctcccc 2400 ttctttcctc tgccactgga taactcaagc cagcccagcg agctcggcgg caggcagtcc 2460 tggaacctca ctgcgggctg cgtctccgag gacatgtgca gtcctgaccc ctgtttcaat 2520 ggtgggactt gcctcgtcac ctggaatgac ttccactgta cctgccctgc caatttcacg 2580 gggcctacgt gtgcccagca gctgtggtgt cccggccagc cctgtctccc acctgccacg 2640 tgtgaggagg tccctgatgg ctttgtgtgt gtggcggagg 
ccacgttccg cgagggtccc 2700 cccgccgcgt tcagcgggca caacgcgtcg tcagggcgct tgctcggcgg cctgtcgctg 2760 gcctttcgca cgcgcgactc cgaggcctgg ctgctgcgtg ccgcggcggg cgccctggaa 2820 ggcgtgtggc tggcggtgcg caatggctcg ctggcggggg gcgtgcgcgg aggccatggc 2880 ctgcccggcg ctgtgctgcc cataccgggg ccgcgcgtgg ccgatggtgc ctggcaccgc 2940 gtgcgtctgg ccatggagcg cccggcggcc accacctcgc gctggctgct gtggctggat 3000 ggtgccgcca ccccggtggc gctgcgcggc ctggccagtg acctgggctt cctgcagggc 3060 ccgggtgctg tgcgcatcct gctggctgag aacttcaccg gctgcttggg ccgcgtggcg 3120 ctgggcggcc tgcccctgcc cttggcgcgg ccccggcccg gcgcggcccc tggcgcccga 3180 gagcacttcg cgtcttggcc tgggacgccg gccccgatcc tcggctgccg cggcgcgccc 3240 gtgtgtgcgc cctcgccctg tctgcacgac ggtgcctgcc gtgacctctt cgacgccttt 3300 gcctgcgcct gcggcccggg gtgggaaggc ccgcgctgcg aagcccacgt cgacccctgt 3360 cactccgccc cctgcgcccg tggccgctgt cacacgcacc ccgacggccg cttcgagtgc 3420 cgctgcccgc ctggcttcgg gggcccgcgc tgcaggttgc ctgtcccatc caaggagtgc 3480 agcctgaatg tcacctgcct cgatggcagc ccatgtgagg gtggctctcc cgctgccaac 3540 tgcagctgcc tggagggtct tgctggccag aggtgtcagg tccccactct cccctgtgaa 3600 gccaacccct gcttgaatgg gggcacctgc cgggcagctg gaggggtgtc tgaatgtatc 3660 tgcaatgcca gattctccgg ccagttctgt gaagtggcga agggcctgcc cctgccgctg 3720 ccattcccac tgctggaggt ggccgtacct gcagcctgtg cctgcctcct cctcctcctc 3780 ctgggcctcc tttcagggat cctggcagcc cgaaagcgcc gccagtctga gggcacctac 3840 agcccaagcc agcaggaggt ggctggggcc cggctggaga tggacagtgt cctcaaggtg 3900 ccaccggagg agagactcat c 3921 8 3786 DNA Homo sapiens 8 atggcgctgg ccaggcctgg gaccccggac ccccaggccc tggcctctgt cctgctactg 60 ctgctctggg cccctgccct ttccctcctg gctgggacgg tgccttcaga gccccccagt 120 gcctgtgcct cagacccgtg cgctccaggg accgagtgcc aggctaccga gagtggtggc 180 tatacctgtg ggcccatgga gccccggggc tgtgccaccc agccatgcca ccacggcgct 240 ctgtgtgtgc cccagggtcc agatcccacc ggcttccgct gctactgcgt gccgggtttc 300 cagggcccac gctgcgagct ggacatcgat gagtgtgcat cccggccgtg ccaccatggg 360 gccacctgcc gcaacctggc cgatcgctac gagtgccatt gcccccttgg ctatgcaggc 420 
gtgacctgcg agatggaggt ggacgagtgc gcctcagcgc cctgcctgca cgggggctcg 480 tgcctggacg gcgtgggctc cttccgctgt gtgtgcgcgc caggctacgg gggcacccgt 540 tgccagctgg acctcgacga gtgccagagc cagccgtgcg cacatggggg cacgtgccac 600 gacctggtca acgggttccg gtgcgactgc gcgggcaccg gctacgaggg cacgcactgc 660 gagcgggagg tgctggagtg cgcatcggcg ccctgcgagc acaacgcgtc ctgcctcgag 720 ggcctcggga gcttccgctg cctctgttgg ccaggctaca gcggcgagct gtgcgaggtg 780 gacgaggacg agtgtgcatc gagcccctgc cagcatgggg gccgatgcct gcagcgctct 840 gacccggccc tctacggggg tgtccaggcc gccttccctg gcgccttcag cttccgccat 900 gctgcgggtt tcctgtgcca ctgccctcct ggctttgagg gagccgactg cggtgtggag 960 gtggacgagt gtgcctcacg gccatgcctc aacggaggcc actgccagga cctgcccaat 1020 ggcttccagt gtcactgccc agatggctac gcagggccga catgtgagga agatgtggat 1080 gaatgcctgt cggatccctg cctgcacggc ggaacctgca gtgacactgt ggcaggctat 1140 atctgcaggt gcccagagac ctggggtggg cgcgactgtt ctgtgcagct cactggctgc 1200 cagggccaca cctgcccgct ggctgccacc tgcatcccta tcttcgagtc tggggtccac 1260 agttacgtct gccactgccc acctggtacc catggaccgt tctgtggcca gaataccacc 1320 ttctctgtga tggctgggag ccccattcag gcatcagtgc cagctggtgg ccccctgggt 1380 ctggcactga ggtttcgcac cacactgccc gctgggacct tggccactcg caatgacacc 1440 aaggaaagct tggagctggc attggtggca gccacacttc aggccacact ctggagctac 1500 agcaccactg tgcttgtcct gagactgccg gacctggccc taaacgatgg ccattggcac 1560 caggtggagg ttgtgctcca tctagcgacc ctggagctac ggctctggca tgagggctgc 1620 cctgcccggc tctgtgtggc ctctggtcct gtggccctgg cttccacggc ttcggcaact 1680 ccgctgcctg ccgggatctc ctctgcccag ctgggggacg cgacctttgc aggctgcctc 1740 caggacgtgc gtgtggatgg ccacctcctg ctgcctgagg atctcggtga gaacgtcctc 1800 ctgggctgtg agcgccgaga gcagtgccgg cctctgcctt gtgtccacgg agggtcctgt 1860 gtggatctgt ggactcattt ccgttgcgac tgtgcccggc cccatagagg tcccacgtgc 1920 gctgatgaga ttcctgctgc cacctttggc ttgggaggcg ccccaagctc tgcctccttt 1980 ctgctccaag agctgccagg tcccaacctc acagtgtctt tccttctccg cactcgggag 2040 tccgctggcc tgttgctcca gtttgccaat gactccgcag ctggcctaac agtattcctg 2100 agtgagggtc 
ggatccgggc tgaggtgccg ggcagtcctg ctgtagtgct ccctgggcgc 2160 tgggatgatg ggctccgtca cctggtgatg ctcagcttcg ggcctgacca gctgcaggac 2220 ctggggcagc acgtgcacgt gggtgggagg ctccttgctg ccgacagcca gccctggggt 2280 gggcccttcc gaggctgcct ccaggacctg cgactcgatg gctgccacct ccccttcttt 2340 cctctgccac tggataactc aagccagccc agcgagctcg gcggcaggca gtcctggaac 2400 ctcactgcgg gctgcgtctc cgaggacatg tgcagtcctg acccctgttt caatggtggg 2460 acttgcctcg tcacctggaa tgacttccac tgtacctgcc ctgccaattt cacggggcct 2520 acgtgtgccc agcagctgtg gtgtcccggc cagccctgtc tcccacctgc cacgtgtgag 2580 gaggtccctg atggctttgt gtgtgtggcg gaggccacgt tccgcgaggg tccccccgcc 2640 gcgttcagcg ggcacaacgc gtcgtcaggg cgcttgctcg gcggcctgtc gctggccttt 2700 cgcacgcgcg actccgaggc ctggctgctg cgtgccgcgg cgggcgccct ggaaggcgtg 2760 tggctggcgg tgcgcaatgg ctcgctggcg gggggcgtgc gcggaggcca tggcctgccc 2820 ggcgctgtgc tgcccatacc ggggccgcgc gtggccgatg gtgcctggca ccgcgtgcgt 2880 ctggccatgg agcgcccggc ggccaccacc tcgcgctggc tgctgtggct ggatggtgcc 2940 gccaccccgg tggcgctgcg cggcctggcc agtgacctgg gcttcctgca gggcccgggt 3000 gctgtgcgca tcctgctggc tgagaacttc accggctgct tgggccgcca cttcgcgtct 3060 tggcctggga cgccggcccc gatcctcggc tgccgcggcg cgcccgtgtg tgcgccctcg 3120 ccctgtctgc acgacggtgc ctgccgtgac ctcttcgacg cctttgcctg cgcctgcggc 3180 ccggggtggg aaggcccgcg ctgcgaagcc cacgtcgacc cctgtcactc cgccccctgc 3240 gcccgtggcc gctgtcacac gcaccccgac ggccgcttcg agtgccgctg cccgcctggc 3300 ttcgggggcc cgcgctgcag gttgcctgtc ccatccaagg agtgcagcct gaatgtcacc 3360 tgcctcgatg gcagcccatg tgagggtggc tctcccgctg ccaactgcag ctgcctggag 3420 ggtcttgctg gccagaggtg tcaggtcccc actctcccct gtgaagccaa cccctgcttg 3480 aatgggggca cctgccgggc agctggaggg gtgtctgaat gtatctgcaa tgccagattc 3540 tccggccagt tctgtgaagt ggcgaagggc ctgcccctgc cgctgccatt cccactgctg 3600 gaggtggccg tacctgcagc ctgtgcctgc ctcctcctcc tcctcctggg cctcctttca 3660 gggatcctgg cagcccgaaa gcgccgccag tctgagggca cctacagccc aagccagcag 3720 gaggtggctg gggcccggct ggagatggac agtgtcctca aggtgccacc ggaggagaga 3780 ctcatc 3786 9 3612 DNA 
Homo sapiens 9 tgtgccaccc agccatgcca ccacggcgct ctgtgtgtgc cccagggtcc agatcccacc 60 ggcttccgct gctactgcgt gccgggtttc cagggcccac gctgcgagct ggacatcgat 120 gagtgtgcat cccggccgtg ccaccatggg gccacctgcc gcaacctggc cgatcgctac 180 gagtgccatt gcccccttgg ctatgcaggc gtgacctgcg agatggaggt ggacgagtgc 240 gcctcagcgc cctgcctgca cgggggctcg tgcctggacg gcgtgggctc cttccgctgt 300 gtgtgcgcgc caggctacgg gggcacccgt tgccagctgg acctcgacga gtgccagagc 360 cagccgtgcg cacatggggg cacgtgccac gacctggtca acgggttccg gtgcgactgc 420 gcgggcaccg gctacgaggg cacgcactgc gagcgggagg tgctggagtg cgcatcggcg 480 ccctgcgagc acaacgcgtc ctgcctcgag ggcctcggga gcttccgctg cctctgttgg 540 ccaggctaca gcggcgagct gtgcgaggtg gacgaggacg agtgtgcatc gagcccctgc 600 cagcatgggg gccgatgcct gcagcgctct gacccggccc tctacggggg tgtccaggcc 660 gccttccctg gcgccttcag cttccgccat gctgcgggtt tcctgtgcca ctgccctcct 720 ggctttgagg ggccgacatg tgaggaagat gtggatgaat gcctgtcgga tccctgcctg 780 cacggcggaa cctgcagtga cactgtggca ggctatatct gcaggtgccc agagacctgg 840 ggtgggcgcg actgttctgt gcagctcact ggctgccagg gccacacctg cccgctggct 900 gccacctgca tccctatctt cgagtctggg gtccacagtt acgtctgcca ctgcccacct 960 ggtacccatg gaccgttctg tggccagaat accaccttct ctgtgatggc tgggagcccc 1020 attcaggcat cagtgccagc tggtggcccc ctgggtctgg cactgaggtt tcgcaccaca 1080 ctgcccgctg ggaccttggc cactcgcaat gacaccaagg aaagcttgga gctggcattg 1140 gtggcagcca cacttcaggc cacactctgg agctacagca ccactgtgct tgtcctgaga 1200 ctgccggacc tggccctaaa cgatggccat tggcaccagg tggaggttgt gctccatcta 1260 gcgaccctgg agctacggct ctggcatgag ggctgccctg cccggctctg tgtggcctct 1320 ggtcctgtgg ccctggcttc cacggcttcg gcaactccgc tgcctgccgg gatctcctct 1380 gcccagctgg gggacgcgac ctttgcaggc tgcctccagg acgtgcgtgt ggatggccac 1440 ctcctgctgc ctgaggatct cggtgagaac gtcctcctgg gctgtgagcg ccgagagcag 1500 tgccggcctc tgccttgtgt ccacggaggg tcctgtgtgg atctgtggac tcatttccgt 1560 tgcgactgtg cccggcccca tagaggtccc acgtgcgctg atgagattcc tgctgccacc 1620 tttggcttgg gaggcgcccc aagctctgcc tcctttctgc tccaagagct gccaggtccc 1680 aacctcacag 
tgtctttcct tctccgcact cgggagtccg ctggcctgtt gctccagttt 1740 gccaatgact ccgcagctgg cctaacagta ttcctgagtg agggtcggat ccgggctgag 1800 gtgccgggca gtcctgctgt agtgctccct gggcgctggg atgatgggct ccgtcacctg 1860 gtgatgctca gcttcgggcc tgaccagctg caggacctgg ggcagcacgt gcacgtgggt 1920 gggaggctcc ttgctgccga cagccagccc tggggtgggc ccttccgagg ctgcctccag 1980 gacctgcgac tcgatggctg ccacctcccc ttctttcctc tgccactgga taactcaagc 2040 cagcccagcg agctcggcgg caggcagtcc tggaacctca ctgcgggctg cgtctccgag 2100 gacatgtgca gtcctgaccc ctgtttcaat ggtgggactt gcctcgtcac ctggaatgac 2160 ttccactgta cctgccctgc caatttcacg gggcctacgt gtgcccagca gctgtggtgt 2220 cccggccagc cctgtctccc acctgccacg tgtgaggagg tccctgatgg ctttgtgtgt 2280 gtggcggagg ccacgttccg cgagggtccc cccgccgcgt tcagcgggca caacgcgtcg 2340 tcagggcgct tgctcggcgg cctgtcgctg gcctttcgca cgcgcgactc cgaggcctgg 2400 ctgctgcgtg ccgcggcggg cgccctggaa ggcgtgtggc tggcggtgcg caatggctcg 2460 ctggcggggg gcgtgcgcgg aggccatggc ctgcccggcg ctgtgctgcc cataccgggg 2520 ccgcgcgtgg ccgatggtgc ctggcaccgc gtgcgtctgg ccatggagcg cccggcggcc 2580 accacctcgc gctggctgct gtggctggat ggtgccgcca ccccggtggc gctgcgcggc 2640 ctggccagtg acctgggctt cctgcagggc ccgggtgctg tgcgcatcct gctggctgag 2700 aacttcaccg gctgcttggg ccgcgtggcg ctgggcggcc tgcccctgcc cttggcgcgg 2760 ccccggcccg gcgcggcccc tggcgcccga gagcacttcg cgtcttggcc tgggacgccg 2820 gccccgatcc tcggctgccg cggcgcgccc gtgtgtgcgc cctcgccctg tctgcacgac 2880 ggtgcctgcc gtgacctctt cgacgccttt gcctgcgcct gcggcccggg gtgggaaggc 2940 ccgcgctgcg aagcccacgt cgacccctgt cactccgccc cctgcgcccg tggccgctgt 3000 cacacgcacc ccgacggccg cttcgagtgc cgctgcccgc ctggcttcgg gggcccgcgc 3060 tgcagaacac aagctcaggt ggcccacggt cacagcgctg gggagggcag gtcccaggtg 3120 tcctgcaccc actccagcct ctgctctctc cccaggttgc ctgtcccatc caaggagtgc 3180 agcctgaatg tcacctgcct cgatggcagc ccatgtgagg gtggctctcc cgctgccaac 3240 tgcagctgcc tggagggtct tgctggccag aggtgtcagg tccccactct cccctgtgaa 3300 gccaacccct gcttgaatgg gggcacctgc cgggcagctg gaggggtgtc tgaatgtatc 3360 tgcaatgcca gattctccgg 
ccagttctgt gaaggcctgc ccctgccgct gccattccca 3420 ctgctggagg tggccgtacc tgcagcctgt gcctgcctcc tcctcctcct cctgggcctc 3480 ctttcaggga tcctggcagc ccgaaagcgc cgccagtctg agggcaccta cagcccaagc 3540 cagcaggagg tggctggggc ccggctggag atggacagtg tcctcaaggt gccaccggag 3600 gagagactca tc 3612 10 3825 DNA Homo sapiens 10 ggtaactccc tggaactgtg ctctgagccc aaactctcaa gggttggtca gtgccaggca 60 caggggacgg tgccttcaga gccccccagt gcctgtgcct cagacccgtg cgctccaggg 120 accgagtgcc aggctaccga gagtggtggc tatacctgtg ggcccatgga gccccggggc 180 tgtgccaccc agccatgcca ccacggcgct ctgtgtgtgc cccagggtcc agatcccacc 240 ggcttccgct gctactgcgt gccgggtttc cagggcccac gctgcgagct ggacatcgat 300 gagtgtgcat cccggccgtg ccaccatggg gccacctgcc gcaacctggc cgatcgctac 360 gagtgccatt gcccccttgg ctatgcaggc gtgacctgcg agatggaggt ggacgagtgc 420 gcctcagcgc cctgcctgca cgggggctcg tgcctggacg gcgtgggctc cttccgctgt 480 gtgtgcgcgc caggctacgg gggcacccgt tgccagctgg acctcgacga gtgccagagc 540 cagccgtgcg cacatggggg cacgtgccac gacctggtca acgggttccg gtgcgactgc 600 gcgggcaccg gctacgaggg cacgcactgc gagcgggagg tgctggagtg cgcatcggcg 660 ccctgcgagc acaacgcgtc ctgcctcgag ggcctcggga gcttccgctg cctctgttgg 720 ccaggctaca gcggcgagct gtgcgaggtg gacgaggacg agtgtgcatc gagcccctgc 780 cagcatgggg gccgatgcct gcagcgctct gacccggccc tctacggggg tgtccaggcc 840 gccttccctg gcgccttcag cttccgccat gctgcgggtt tcctgtgcca ctgccctcct 900 ggctttgagg gagccgactg cggtgtggag gtggacgagt gtgcctcacg gccatgcctc 960 aacggaggcc actgccagga cctgcccaat ggcttccagt gtcactgccc agatggctac 1020 gcagggccga catgtgagga agatgtggat gaatgcctgt cggatccctg cctgcacggc 1080 ggaacctgca gtgacactgt ggcaggctat atctgcaggt gcccagagac ctggggtggg 1140 cgcgactgtt ctgtgcagct cactggctgc cagggccaca cctgcccgct ggctgccacc 1200 tgcatcccta tcttcgagtc tggggtccac agttacgtct gccactgccc acctggtacc 1260 catggaccgt tctgtggcca gaataccacc ttctctgtga tggctgggag ccccattcag 1320 gcatcagtgc cagctggtgg ccccctgggt ctggcactga ggtttcgcac cacactgccc 1380 gctgggacct tggccactcg caatgacacc aaggaaagct tggagctggc attggtggca 1440 
gccacacttc aggccacact ctggagctac agcaccactg tgcttgtcct gagactgccg 1500 gacctggccc taaacgatgg ccattggcac caggtggagg ttgtgctcca tctagcgacc 1560 ctggagctac ggctctggca tgagggctgc cctgcccggc tctgtgtggc ctctggtcct 1620 gtggccctgg cttccacggc ttcggcaact ccgctgcctg ccgggatctc ctctgcccag 1680 ctgggggacg cgacctttgc aggctgcctc caggacgtgc gtgtggatgg ccacctcctg 1740 ctgcctgagg atctcggtga gaacgtcctc ctgggctgtg agcgccgaga gcagtgccgg 1800 cctctgcctt gtgtccacgg agggtcctgt gtggatctgt ggactcattt ccgttgcgac 1860 tgtgcccggc cccatagagg tcccacgtgc gctgatgaga ttcctgctgc cacctttggc 1920 ttgggaggcg ccccaagctc tgcctccttt ctgctccaag agctgccagg tcccaacctc 1980 acagtgtctt tccttctccg cactcgggag tccgctggcc tgttgctcca gtttgccaat 2040 gactccgcag ctggcctaac agtattcctg agtgagggtc ggatccgggc tgaggtgccg 2100 ggcagtcctg ctgtagtgct ccctgggcgc tgggatgatg ggctccgtca cctggtgatg 2160 ctcagcttcg ggcctgacca gctgcaggac ctggggcagc acgtgcacgt gggtgggagg 2220 ctccttgctg ccgacagcca gccctggggt gggcccttcc gaggctgcct ccaggacctg 2280 cgactcgatg gctgccacct ccccttcttt cctctgccac tggataactc aagccagccc 2340 agcgagctcg gcggcaggca gtcctggaac ctcactgcgg gctgcgtctc cgaggacatg 2400 tgcagtcctg acccctgttt caatggtggg acttgcctcg tcacctggaa tgacttccac 2460 tgtacctgcc ctgccaattt cacggggcct acgtgtgccc agcagctgtg gtgtcccggc 2520 cagccctgtc tcccacctgc cacgtgtgag gaggtccctg atggctttgt gtgtgtggcg 2580 gaggccacgt tccgcgaggg tccccccgcc gcgttcagcg ggcacaacgc gtcgtcaggg 2640 cgcttgctcg gcggcctgtc gctggccttt cgcacgcgcg actccgaggc ctggctgctg 2700 cgtgccgcgg cgggcgccct ggaaggcgtg tggctggcgg tgcgcaatgg ctcgctggcg 2760 gggggcgtgc gcggaggcca tggcctgccc ggcgctgtgc tgcccatacc ggggccgcgc 2820 gtggccgatg gtgcctggca ccgcgtgcgt ctggccatgg agcgcccggc ggccaccacc 2880 tcgcgctggc tgctgtggct ggatggtgcc gccaccccgg tggcgctgcg cggcctggcc 2940 agtgacctgg gcttcctgca gggcccgggt gctgtgcgca tcctgctggc tgagaacttc 3000 accggctgct tgggccgcgt ggcgctgggc ggcctgcccc tgcccttggc gcggccccgg 3060 cccggcgcgg cccctggcgc ccgagagcac ttcgcgtctt ggcctgggac gccggccccg 3120 atcctcggct 
gccgcggcgc gcccgtgtgt gcgccctcgc cctgtctgca cgacggtgcc 3180 tgccgtgacc tcttcgacgc ctttgcctgc gcctgcggcc cggggtggga aggcccgcgc 3240 tgcgaagccc acgtcgaccc ctgtcactcc gccccctgcg cccgtggccg ctgtcacacg 3300 caccccgacg gccgcttcga gtgccgctgc ccgcctggct tcgggggccc gcgctgcagg 3360 ttgcctgtcc catccaagga gtgcagcctg aatgtcacct gcctcgatgg cagcccatgt 3420 gagggtggct ctcccgctgc caactgcagc tgcctggagg gtcttgctgg ccagaggtgt 3480 caggtcccca ctctcccctg tgaagccaac ccctgcttga atgggggcac ctgccgggca 3540 gctggagggg tgtctgaatg tatctgcaat gccagattct ccggccagtt ctgtgaagtg 3600 gcgaagggcc tgcccctgcc gctgccattc ccactgctgg aggtggccgt acctgcagcc 3660 tgtgcctgcc tcctcctcct cctcctgggc ctcctttcag ggatcctggc agcccgaaag 3720 cgccgccagt ctgagggcac ctacagccca agccagcagg aggtggctgg ggcccggctg 3780 gagatggaca gtgtcctcaa ggtgccaccg gaggagagac tcatc 3825 11 453 DNA Homo sapiens 11 aatggtggga cttgcctcgt cacctggaat gacttccact gtacctgccc tgccaatttc 60 acggggccta cgtgtgccca gcagctgtgg tgtcccggcc agccctgtct cccacctgcc 120 acgtgtgagg aggtccctga tggctttgtg tgtgtggcgg aggccacgtt ccgcgagggt 180 ccccccgccg cgttcagcgg gcacaacgcg tcgtcagggc gcttgctcgg cggcctgtcg 240 ctggcctttc gcacgcgcga ctccgaggcc tggctgctgc gtgccgcggc gggcgccctg 300 gaaggcgtgt ggctggcggt gcgcaatggc tcgctggcgg ggggcgtgcg cggaggccat 360 ggcctgcccg gcgctgtgct gcccataccg gggccgcgcg tggccgatgg tgcctggcac 420 cgcgtgcgtc tggccatgga gcgcccggcg gcc 453 12 583 DNA Homo sapiens 12 gcggccgccg ggcgctccat ggccagacgc acgcggtgcc aggcaccatc ggccacgcgc 60 ggccccggta tgggcagcac agcgccgggc aggccatggc ctccgcgcac gccccccgcc 120 agcgagccat tgcgcaccgc cagccacacg ccttccaggg cgcccgccgc ggcacgcagc 180 agccaggcct cggagtcgcg cgtgcgaaag gccagcgaca ggccgccgag caagcgccct 240 gacgacgcgt tgtgcccgct gaacgcggcg gggggaccct tgcggaacgt ggcctccgcc 300 acacctgcgg agagaaggag cctcagggaa tagcccacgg agccttttac agaaatgcca 360 cctccccact ttcccctgga gtgggagaaa accccttact tggggggggg cactcgcccc 420 cactttttta ttaaatctgg aaactgtacc gccccaccac cctcatttta ctccaatgga 480 tttcctacca tggagagcat 
tccggccagt gctgggggtg agcgggatgc ctagcgacct 540 tcaggtacaa gccagcaact agaattcctg cagaatctgc ccc 583 13 2273 DNA Homo sapiens 13 ctgcgaagcc cacgtcgacc cctgtcactc cgccccctgc gcccgtggcc gctgtcacac 60 gcaccccgac ggccgcttcg agtgccgctg cccgcctggc ttcgggggcc cgcgctgcag 120 gttgcctgtc ccatccaagg agtgcagcct gaatgtcacc tgcctcgatg gcagcccatg 180 tgagggtggc tctcccgctg ccaactgcag ctgcctggag ggtcttgctg gccagaggtg 240 tcaggtcccc actctcccct gtgaagccaa cccctgcttg aatgggggca cctgccgggc 300 agctggaggg gtgtctgaat gtatctgcaa tgccagattc tccggccagt tctgtgaagt 360 ggcgaagggc ctgcccctgc cgctgccatt cccactgctg gaggtggccg tacctgcagc 420 ctgtgcctgc ctcctcctcc tcctcctggg cctcctttca gggatcctgg cagcccgaaa 480 gcgccgccag tctgagggca cctacagccc aagccagcag gaggtggctg gggcccggct 540 ggagatggac agtgtcctca aggtgccacc ggaggagaga ctcatctagg ccagcctggc 600 tgctggcacc agcacctgga ggtcctgaat ggtttctacc tggagaccca aggaagctgc 660 ttccagggct cgggacattg ctacggaagt gtccccttgg ctggcagcct ctgcctctgc 720 ctctgcccca tcctggatgg aggacgaggg gagcaactca gggaaacaga ggcctagaga 780 ggctgcggac ttctccatcc caccctcggg gttccgcctt ggcaggtgta cggctgtgcg 840 tgggagggca cacgtgggtt cacagtgtgt tcaggagtgt gtgtatctgg aggagtgtgt 900 gtgtgagtgt gtacctgggc ctgtgttagt ctgcagatgc tagtgtgagt gtgtcctgac 960 atggctccag ggcgtgtctg ccgtgtttac tgtgtgtcta tgactgtgat gggtgtagct 1020 gatcccagga ggtggcggct gcgccatggg gtcaaccatt acagtcctag ggcaggggcg 1080 gcccaaggct gcatgttctc caggaggcca ggccggggtt gcccaggcac ctccttcccc 1140 gcctctgggg gctgctcctg ctgtggaggc agctgggaag tcagggaagg ccactagcag 1200 aggctgagtg ggcttctggc tcctagaaca aatgtccctt caggcaggtc tgtctgccag 1260 aggccagagc cagtcatgcg agggaaccac agacccaccc gccccctcag ccggagcagc 1320 ccgagggagc agaggagggg ctgcctggag cttcccaccc tgctgtggtc atttgtcaaa 1380 gggggaaggc acccactgcc tacctcacag ggctgttgtg aggatcagag aggacgacag 1440 tggggaaaga atctggaagt cttcaactgc cgtctgatgg gaaggaccgt ctgggtgtcc 1500 ttctgggatg aggatgacag agcaaccctt ctcctgccct gaaccccccc agctcacctg 1560 accacctctg gttctccagc tccggtcctt 
cctagcagcc tggtgagctc actccttccc 1620 ctgatgactg gctgcctcta cacagactcg gcgagaggac ttgaaggaag ccctctgggt 1680 tgtctgctga gtacaggggc tcagtgaaca ctggcgctgc ctctgagtcg gggctgggcc 1740 tgcagaggcc gactcagagg agactctgct gcttgctccc agccccttcc ccggcgatgc 1800 ccatcacact gtgacctccc atccctgaag ggcacctgcc tgagggcctg gcctccttcc 1860 agcttcatgg acctggagat gtgccctttc atccttcctg cttcccaggc cagtagatcc 1920 gtttacactt ttgggtcgac agtcagcttt tccttttggt tttggcgggt cccagaggca 1980 tgggtgtcca gtccaatgtg gggagccacg tgacaacgtg ggggactggg acatgggact 2040 gggaagtcag cagacgctgg gatagagagg gccctgaaca ccaggctcag gggcttgctt 2100 ggtccttatc ctgtaggagg tgggaccacc tttccctgaa ctttctctac aacccttggg 2160 agcgtgggga ggaggcggct ggttccaggg tcagtttact aagttagaga tttggaaaac 2220 ctgtgtcagc tgtaactcct aggatatttt atgtggaacc taacatgcag atg 2273 14 1376 PRT Homo sapiens 14 Met Ala Leu Lys Asn Ile Asn Tyr Leu Leu Ile Phe Tyr Leu Ser Phe 1 5 10 15 Ser Leu Leu Ile Tyr Ile Lys Asn Ser Phe Cys Asn Lys Asn Asn Thr 20 25 30 Arg Cys Leu Ser Asn Ser Cys Gln Asn Asn Ser Thr Cys Lys Asp Phe 35 40 45 Ser Lys Asp Asn Asp Cys Ser Cys Ser Asp Thr Ala Asn Asn Leu Asp 50 55 60 Lys Asp Cys Asp Asn Met Lys Asp Pro Cys Phe Ser Asn Pro Cys Gln 65 70 75 80 Gly Ser Ala Thr Cys Val Asn Thr Pro Gly Glu Arg Ser Phe Leu Cys 85 90 95 Lys Cys Pro Pro Gly Tyr Ser Gly Thr Ile Cys Glu Thr Thr Ile Gly 100 105 110 Ser Cys Gly Lys Asn Ser Cys Gln His Gly Gly Ile Cys His Gln Asp 115 120 125 Pro Ile Tyr Pro Val Cys Ile Cys Pro Ala Gly Tyr Ala Gly Arg Phe 130 135 140 Cys Glu Ile Asp His Asp Glu Cys Ala Ser Ser Pro Cys Gln Asn Gly 145 150 155 160 Ala Val Cys Gln Asp Gly Ile Asp Gly Tyr Ser Cys Phe Cys Val Pro 165 170 175 Gly Tyr Gln Gly Arg His Cys Asp Leu Glu Val Asp Glu Cys Ala Ser 180 185 190 Asp Pro Cys Lys Asn Glu Ala Thr Cys Leu Asn Glu Ile Gly Arg Tyr 195 200 205 Thr Cys Ile Cys Pro His Asn Tyr Ser Gly Val Asn Cys Glu Leu Glu 210 215 220 Ile Asp Glu Cys Trp Ser Gln Pro Cys Leu Asn Gly Ala Thr Cys Gln 225 230 235 240 Asp Ala Leu Gly Ala Tyr 
Phe Cys Asp Cys Ala Pro Gly Phe Leu Gly 245 250 255 Asp His Cys Glu Leu Asn Thr Asp Glu Cys Ala Ser Gln Pro Cys Leu 260 265 270 His Gly Gly Leu Cys Val Asp Gly Glu Asn Arg Tyr Ser Cys Asn Cys 275 280 285 Thr Gly Ser Gly Phe Thr Gly Thr His Cys Glu Thr Leu Met Pro Leu 290 295 300 Cys Trp Ser Lys Pro Cys His Asn Asn Ala Thr Cys Glu Asp Ser Val 305 310 315 320 Asp Asn Tyr Thr Cys His Cys Trp Pro Gly Tyr Thr Gly Ala Gln Cys 325 330 335 Glu Ile Asp Leu Asn Glu Cys Asn Ser Asn Pro Cys Gln Ser Asn Gly 340 345 350 Glu Cys Val Glu Leu Ser Ser Glu Lys Gln Tyr Gly Arg Ile Thr Gly 355 360 365 Leu Pro Ser Ser Phe Ser Tyr His Glu Ala Ser Gly Tyr Val Cys Ile 370 375 380 Cys Gln Pro Gly Phe Thr Gly Ile His Cys Glu Glu Asp Val Asn Glu 385 390 395 400 Cys Ser Ser Asn Pro Cys Gln Asn Gly Gly Thr Cys Glu Asn Leu Pro 405 410 415 Gly Asn Tyr Thr Cys His Cys Pro Phe Asp Asn Leu Ser Arg Thr Phe 420 425 430 Tyr Gly Gly Arg Asp Cys Ser Asp Ile Leu Leu Gly Cys Thr His Gln 435 440 445 Gln Cys Leu Asn Asn Gly Thr Cys Ile Pro His Phe Gln Asp Gly Gln 450 455 460 His Gly Phe Ser Cys Leu Cys Pro Ser Gly Tyr Thr Gly Ser Leu Cys 465 470 475 480 Glu Ile Ala Thr Thr Leu Ser Phe Glu Gly Asp Gly Phe Leu Trp Val 485 490 495 Lys Ser Gly Ser Val Thr Thr Lys Gly Ser Val Cys Asn Ile Ala Leu 500 505 510 Arg Phe Gln Thr Val Gln Pro Met Ala Leu Leu Leu Phe Arg Ser Asn 515 520 525 Arg Asp Val Phe Val Lys Leu Glu Leu Leu Ser Gly Tyr Ile His Leu 530 535 540 Ser Ile Gln Val Asn Asn Gln Ser Lys Val Leu Leu Phe Ile Ser His 545 550 555 560 Asn Thr Ser Asp Gly Glu Trp His Phe Val Glu Val Ile Phe Ala Glu 565 570 575 Ala Val Thr Leu Thr Leu Ile Asp Asp Ser Cys Lys Glu Lys Cys Ile 580 585 590 Ala Lys Ala Pro Thr Pro Leu Glu Ser Asp Gln Ser Ile Cys Ala Phe 595 600 605 Gln Asn Ser Phe Leu Gly Gly Leu Pro Val Gly Met Thr Ser Asn Gly 610 615 620 Val Ala Leu Leu Asn Phe Tyr Asn Met Pro Ser Thr Pro Ser Phe Val 625 630 635 640 Gly Cys Leu Gln Asp Ile Lys Ile Asp Trp Asn His Ile Thr Leu Glu 645 650 655 Asn Ile Ser Ser Gly Ser Ser 
Leu Asn Val Lys Ala Gly Cys Val Arg 660 665 670 Lys Asp Trp Cys Glu Ser Gln Pro Cys Gln Ser Arg Gly Arg Cys Ile 675 680 685 Asn Leu Trp Leu Ser Tyr Gln Cys Asp Cys His Arg Pro Tyr Glu Gly 690 695 700 Pro Asn Cys Leu Arg Glu Tyr Val Ala Gly Arg Phe Gly Gln Asp Asp 705 710 715 720 Ser Thr Gly Tyr Val Ile Phe Thr Leu Asp Glu Ser Tyr Gly Asp Thr 725 730 735 Ile Ser Leu Ser Met Phe Val Arg Thr Leu Gln Pro Ser Gly Leu Leu 740 745 750 Leu Ala Leu Glu Asn Ser Thr Tyr Gln Tyr Ile Arg Val Trp Leu Glu 755 760 765 Arg Gly Arg Leu Ala Met Leu Thr Pro Asn Ser Pro Lys Leu Val Val 770 775 780 Lys Phe Val Leu Asn Asp Gly Asn Val His Leu Ile Ser Leu Lys Ile 785 790 795 800 Lys Pro Tyr Lys Ile Glu Leu Tyr Gln Ser Ser Gln Asn Leu Gly Phe 805 810 815 Ile Ser Ala Ser Thr Trp Lys Ile Glu Lys Gly Asp Val Ile Tyr Ile 820 825 830 Gly Gly Leu Pro Asp Lys Gln Glu Thr Glu Leu Asn Gly Gly Phe Phe 835 840 845 Lys Gly Cys Ile Gln Asp Val Arg Leu Asn Asn Gln Asn Leu Glu Phe 850 855 860 Phe Pro Asn Pro Thr Asn Asn Ala Ser Leu Asn Pro Val Leu Val Asn 865 870 875 880 Val Thr Gln Gly Cys Ala Gly Asp Asn Ser Cys Lys Ser Asn Pro Cys 885 890 895 His Asn Gly Gly Val Cys His Ser Arg Trp Asp Asp Phe Ser Cys Ser 900 905 910 Cys Pro Ala Leu Thr Ser Gly Lys Ala Cys Glu Glu Val Gln Trp Cys 915 920 925 Gly Phe Ser Pro Cys Pro His Gly Ala Gln Cys Gln Pro Val Leu Gln 930 935 940 Gly Phe Glu Cys Ile Ala Asn Ala Val Phe Asn Gly Gln Ser Gly Gln 945 950 955 960 Ile Leu Phe Arg Ser Asn Gly Asn Ile Thr Arg Glu Leu Thr Asn Ile 965 970 975 Thr Phe Gly Phe Arg Thr Arg Asp Ala Asn Val Ile Ile Leu His Ala 980 985 990 Glu Lys Glu Pro Glu Phe Leu Asn Ile Ser Ile Gln Asp Ser Arg Leu 995 1000 1005 Phe Phe Gln Leu Gln Ser Gly Asn Ser Phe Tyr Met Leu Ser Leu 1010 1015 1020 Thr Ser Leu Gln Ser Val Asn Asp Gly Thr Trp His Glu Val Thr 1025 1030 1035 Leu Ser Met Thr Asp Pro Leu Ser Gln Thr Ser Arg Trp Gln Met 1040 1045 1050 Glu Val Asp Asn Glu Thr Pro Phe Val Thr Ser Thr Ile Ala Thr 1055 1060 1065 Gly Ser Leu Asn Phe Leu Lys Asp Asn 
Thr Asp Ile Tyr Val Gly 1070 1075 1080 Asp Arg Ala Ile Asp Asn Ile Lys Gly Leu Gln Gly Cys Leu Ser 1085 1090 1095 Thr Ile Glu Ile Gly Gly Ile Tyr Leu Ser Tyr Phe Glu Asn Val 1100 1105 1110 His Gly Phe Ile Asn Lys Pro Gln Glu Glu Gln Phe Leu Lys Ile 1115 1120 1125 Ser Thr Asn Ser Val Val Thr Gly Cys Leu Gln Leu Asn Val Cys 1130 1135 1140 Asn Ser Asn Pro Cys Leu His Gly Gly Asn Cys Glu Asp Ile Tyr 1145 1150 1155 Ser Ser Tyr His Cys Ser Cys Pro Leu Gly Trp Ser Gly Lys His 1160 1165 1170 Cys Glu Leu Asn Ile Asp Glu Cys Phe Ser Asn Pro Cys Ile His 1175 1180 1185 Gly Asn Cys Ser Asp Arg Val Ala Ala Tyr His Cys Thr Cys Glu 1190 1195 1200 Pro Gly Tyr Thr Gly Val Asn Cys Glu Val Asp Ile Asp Asn Cys 1205 1210 1215 Gln Ser His Gln Cys Ala Asn Gly Ala Thr Cys Ile Ser His Thr 1220 1225 1230 Asn Gly Tyr Ser Cys Leu Cys Phe Gly Asn Phe Thr Gly Lys Phe 1235 1240 1245 Cys Arg Gln Ser Arg Leu Pro Ser Thr Val Cys Gly Asn Glu Lys 1250 1255 1260 Thr Asn Leu Thr Cys Tyr Asn Gly Gly Asn Cys Thr Glu Phe Gln 1265 1270 1275 Thr Glu Leu Lys Cys Met Cys Arg Pro Gly Phe Thr Gly Glu Trp 1280 1285 1290 Cys Glu Lys Asp Ile Asp Glu Cys Ala Ser Asp Pro Cys Val Asn 1295 1300 1305 Gly Gly Leu Cys Gln Asp Leu Leu Asn Lys Phe Gln Cys Leu Cys 1310 1315 1320 Asp Val Ala Phe Ala Gly Glu Arg Cys Glu Val Asp Val Ser Ser 1325 1330 1335 Leu Ser Phe Tyr Val Ser Leu Leu Phe Trp Gln Asn Leu Phe Gln 1340 1345 1350 Leu Leu Ser Tyr Leu Ile Leu Arg Met Asn Asp Glu Pro Val Val 1355 1360 1365 Glu Trp Gly Glu Gln Glu Asp Tyr 1370 1375 15 1406 PRT Homo sapiens 15 Met Ala Leu Lys Asn Ile Asn Tyr Leu Leu Ile Phe Tyr Leu Ser Phe 1 5 10 15 Ser Leu Leu Ile Tyr Ile Lys Asn Ser Phe Cys Asn Lys Asn Asn Thr 20 25 30 Arg Cys Leu Ser Asn Ser Cys Gln Asn Asn Ser Thr Cys Lys Asp Phe 35 40 45 Ser Lys Asp Asn Asp Cys Ser Cys Ser Asp Thr Ala Asn Asn Leu Asp 50 55 60 Lys Asp Cys Asp Asn Met Lys Asp Pro Cys Phe Ser Asn Pro Cys Gln 65 70 75 80 Gly Ser Ala Thr Cys Val Asn Thr Pro Gly Glu Arg Ser Phe Leu Cys 85 90 95 Lys Cys Pro Pro Gly Tyr 
Ser Gly Thr Ile Cys Glu Thr Thr Ile Gly 100 105 110 Ser Cys Gly Lys Asn Ser Cys Gln His Gly Gly Ile Cys His Gln Asp 115 120 125 Pro Ile Tyr Pro Val Cys Ile Cys Pro Ala Gly Tyr Ala Gly Arg Phe 130 135 140 Cys Glu Ile Asp His Asp Glu Cys Ala Ser Ser Pro Cys Gln Asn Gly 145 150 155 160 Ala Val Cys Gln Asp Gly Ile Asp Gly Tyr Ser Cys Phe Cys Val Pro 165 170 175 Gly Tyr Gln Gly Arg His Cys Asp Leu Glu Val Asp Glu Cys Ala Ser 180 185 190 Asp Pro Cys Lys Asn Glu Ala Thr Cys Leu Asn Glu Ile Gly Arg Tyr 195 200 205 Thr Cys Ile Cys Pro His Asn Tyr Ser Gly Val Asn Cys Glu Leu Glu 210 215 220 Ile Asp Glu Cys Trp Ser Gln Pro Cys Leu Asn Gly Ala Thr Cys Gln 225 230 235 240 Asp Ala Leu Gly Ala Tyr Phe Cys Asp Cys Ala Pro Gly Phe Leu Gly 245 250 255 Asp His Cys Glu Leu Asn Thr Asp Glu Cys Ala Ser Gln Pro Cys Leu 260 265 270 His Gly Gly Leu Cys Val Asp Gly Glu Asn Arg Tyr Ser Cys Asn Cys 275 280 285 Thr Gly Ser Gly Phe Thr Gly Thr His Cys Glu Thr Leu Met Pro Leu 290 295 300 Cys Trp Ser Lys Pro Cys His Asn Asn Ala Thr Cys Glu Asp Ser Val 305 310 315 320 Asp Asn Tyr Thr Cys His Cys Trp Pro Gly Tyr Thr Gly Ala Gln Cys 325 330 335 Glu Ile Asp Leu Asn Glu Cys Asn Ser Asn Pro Cys Gln Ser Asn Gly 340 345 350 Glu Cys Val Glu Leu Ser Ser Glu Lys Gln Tyr Gly Arg Ile Thr Gly 355 360 365 Leu Pro Ser Ser Phe Ser Tyr His Glu Ala Ser Gly Tyr Val Cys Ile 370 375 380 Cys Gln Pro Gly Phe Thr Gly Ile His Cys Glu Glu Asp Val Asn Glu 385 390 395 400 Cys Ser Ser Asn Pro Cys Gln Asn Gly Gly Thr Cys Glu Asn Leu Pro 405 410 415 Gly Asn Tyr Thr Cys His Cys Pro Phe Asp Asn Leu Ser Arg Thr Phe 420 425 430 Tyr Gly Gly Arg Asp Cys Ser Asp Ile Leu Leu Gly Cys Thr His Gln 435 440 445 Gln Cys Leu Asn Asn Gly Thr Cys Ile Pro His Phe Gln Asp Gly Gln 450 455 460 His Gly Phe Ser Cys Leu Cys Pro Ser Gly Tyr Thr Gly Ser Leu Cys 465 470 475 480 Glu Ile Ala Thr Thr Leu Ser Phe Glu Gly Asp Gly Phe Leu Trp Val 485 490 495 Lys Ser Gly Ser Val Thr Thr Lys Gly Ser Val Cys Asn Ile Ala Leu 500 505 510 Arg Phe Gln Thr Val Gln Pro 
Met Ala Leu Leu Leu Phe Arg Ser Asn 515 520 525 Arg Asp Val Phe Val Lys Leu Glu Leu Leu Ser Gly Tyr Ile His Leu 530 535 540 Ser Ile Gln Val Asn Asn Gln Ser Lys Val Leu Leu Phe Ile Ser His 545 550 555 560 Asn Thr Ser Asp Gly Glu Trp His Phe Val Glu Val Ile Phe Ala Glu 565 570 575 Ala Val Thr Leu Thr Leu Ile Asp Asp Ser Cys Lys Glu Lys Cys Ile 580 585 590 Ala Lys Ala Pro Thr Pro Leu Glu Ser Asp Gln Ser Ile Cys Ala Phe 595 600 605 Gln Asn Ser Phe Leu Gly Gly Leu Pro Val Gly Met Thr Ser Asn Gly 610 615 620 Val Ala Leu Leu Asn Phe Tyr Asn Met Pro Ser Thr Pro Ser Phe Val 625 630 635 640 Gly Cys Leu Gln Asp Ile Lys Ile Asp Trp Asn His Ile Thr Leu Glu 645 650 655 Asn Ile Ser Ser Gly Ser Ser Leu Asn Val Lys Ala Gly Cys Val Arg 660 665 670 Lys Asp Trp Cys Glu Ser Gln Pro Cys Gln Ser Arg Gly Arg Cys Ile 675 680 685 Asn Leu Trp Leu Ser Tyr Gln Cys Asp Cys His Arg Pro Tyr Glu Gly 690 695 700 Pro Asn Cys Leu Arg Glu Tyr Val Ala Gly Arg Phe Gly Gln Asp Asp 705 710 715 720 Ser Thr Gly Tyr Val Ile Phe Thr Leu Asp Glu Ser Tyr Gly Asp Thr 725 730 735 Ile Ser Leu Ser Met Phe Val Arg Thr Leu Gln Pro Ser Gly Leu Leu 740 745 750 Leu Ala Leu Glu Asn Ser Thr Tyr Gln Tyr Ile Arg Val Trp Leu Glu 755 760 765 Arg Gly Arg Leu Ala Met Leu Thr Pro Asn Ser Pro Lys Leu Val Val 770 775 780 Lys Phe Val Leu Asn Asp Gly Asn Val His Leu Ile Ser Leu Lys Ile 785 790 795 800 Lys Pro Tyr Lys Ile Glu Leu Tyr Gln Ser Ser Gln Asn Leu Gly Phe 805 810 815 Ile Ser Ala Ser Thr Trp Lys Ile Glu Lys Gly Asp Val Ile Tyr Ile 820 825 830 Gly Gly Leu Pro Asp Lys Gln Glu Thr Glu Leu Asn Gly Gly Phe Phe 835 840 845 Lys Gly Cys Ile Gln Asp Val Arg Leu Asn Asn Gln Asn Leu Glu Phe 850 855 860 Phe Pro Asn Pro Thr Asn Asn Ala Ser Leu Asn Pro Val Leu Val Asn 865 870 875 880 Val Thr Gln Gly Cys Ala Gly Asp Asn Ser Cys Lys Ser Asn Pro Cys 885 890 895 His Asn Gly Gly Val Cys His Ser Arg Trp Asp Asp Phe Ser Cys Ser 900 905 910 Cys Pro Ala Leu Thr Ser Gly Lys Ala Cys Glu Glu Val Gln Trp Cys 915 920 925 Gly Phe Ser Pro Cys Pro His Gly 
Ala Gln Cys Gln Pro Val Leu Gln 930 935 940 Gly Phe Glu Cys Ile Ala Asn Ala Val Phe Asn Gly Gln Ser Gly Gln 945 950 955 960 Ile Leu Phe Arg Ser Asn Gly Asn Ile Thr Arg Glu Leu Thr Asn Ile 965 970 975 Thr Phe Gly Phe Arg Thr Arg Asp Ala Asn Val Ile Ile Leu His Ala 980 985 990 Glu Lys Glu Pro Glu Phe Leu Asn Ile Ser Ile Gln Asp Ser Arg Leu 995 1000 1005 Phe Phe Gln Leu Gln Ser Gly Asn Ser Phe Tyr Met Leu Ser Leu 1010 1015 1020 Thr Ser Leu Gln Ser Val Asn Asp Gly Thr Trp His Glu Val Thr 1025 1030 1035 Leu Ser Met Thr Asp Pro Leu Ser Gln Thr Ser Arg Trp Gln Met 1040 1045 1050 Glu Val Asp Asn Glu Thr Pro Phe Val Thr Ser Thr Ile Ala Thr 1055 1060 1065 Gly Ser Leu Asn Phe Leu Lys Asp Asn Thr Asp Ile Tyr Val Gly 1070 1075 1080 Asp Arg Ala Ile Asp Asn Ile Lys Gly Leu Gln Gly Cys Leu Ser 1085 1090 1095 Thr Ile Glu Ile Gly Gly Ile Tyr Leu Ser Tyr Phe Glu Asn Val 1100 1105 1110 His Gly Phe Ile Asn Lys Pro Gln Glu Glu Gln Phe Leu Lys Ile 1115 1120 1125 Ser Thr Asn Ser Val Val Thr Gly Cys Leu Gln Leu Asn Val Cys 1130 1135 1140 Asn Ser Asn Pro Cys Leu His Gly Gly Asn Cys Glu Asp Ile Tyr 1145 1150 1155 Ser Ser Tyr His Cys Ser Cys Pro Leu Gly Trp Ser Gly Lys His 1160 1165 1170 Cys Glu Leu Asn Ile Asp Glu Cys Phe Ser Asn Pro Cys Ile His 1175 1180 1185 Gly Asn Cys Ser Asp Arg Val Ala Ala Tyr His Cys Thr Cys Glu 1190 1195 1200 Pro Gly Tyr Thr Gly Val Asn Cys Glu Val Asp Ile Asp Asn Cys 1205 1210 1215 Gln Ser His Gln Cys Ala Asn Gly Ala Thr Cys Ile Ser His Thr 1220 1225 1230 Asn Gly Tyr Ser Cys Leu Cys Phe Gly Asn Phe Thr Gly Lys Phe 1235 1240 1245 Cys Arg Gln Ser Arg Leu Pro Ser Thr Val Cys Gly Asn Glu Lys 1250 1255 1260 Thr Asn Leu Thr Cys Tyr Asn Gly Gly Asn Cys Thr Glu Phe Gln 1265 1270 1275 Thr Glu Leu Lys Cys Met Cys Arg Pro Gly Phe Thr Gly Glu Trp 1280 1285 1290 Cys Glu Lys Asp Ile Asp Glu Cys Ala Ser Asp Pro Cys Val Asn 1295 1300 1305 Gly Gly Leu Cys Gln Asp Leu Leu Asn Lys Phe Gln Cys Leu Cys 1310 1315 1320 Asp Val Ala Phe Ala Gly Glu Arg Cys Glu Val Asp Leu Ala Asp 1325 1330 
1335 Asp Leu Ile Ser Asp Ile Phe Thr Thr Ile Gly Ser Val Thr Val 1340 1345 1350 Ala Leu Leu Leu Ile Leu Leu Leu Ala Ile Val Ala Ser Val Val 1355 1360 1365 Thr Ser Asn Lys Arg Ala Thr Gln Gly Thr Tyr Ser Pro Ser Arg 1370 1375 1380 Gln Glu Lys Glu Gly Ser Arg Val Glu Met Trp Asn Leu Met Pro 1385 1390 1395 Pro Pro Ala Met Glu Arg Leu Ile 1400 1405 16 120 PRT Homo sapiens 16 Met Ala Asn Pro Gly Leu Gly Leu Leu Leu Ala Leu Gly Leu Pro Phe 1 5 10 15 Leu Leu Ala Arg Trp Gly Arg Ala Trp Gly Gln Ile Gln Thr Thr Ser 20 25 30 Ala Asn Glu Asn Ser Thr Val Leu Pro Ser Ser Thr Ser Ser Ser Ser 35 40 45 Asp Gly Asn Leu Arg Pro Glu Ala Ile Thr Ala Ile Ile Val Val Phe 50 55 60 Ser Leu Leu Ala Ala Leu Leu Leu Ala Val Gly Leu Ala Leu Leu Val 65 70 75 80 Arg Lys Leu Arg Glu Lys Arg Gln Thr Glu Gly Thr Tyr Arg Pro Ser 85 90 95 Ser Glu Glu Gln Val Gly Ala Arg Val Pro Pro Thr Pro Asn Leu Lys 100 105 110 Leu Pro Pro Glu Glu Arg Leu Ile 115 120 17 1307 PRT Homo sapiens 17 Met Ala Leu Ala Arg Pro Gly Thr Pro Asp Pro Gln Ala Leu Ala Ser 1 5 10 15 Val Leu Leu Leu Leu Leu Trp Ala Pro Ala Leu Ser Leu Leu Ala Gly 20 25 30 Gly Asn Ser Leu Glu Leu Cys Ser Glu Pro Lys Leu Ser Arg Val Gly 35 40 45 Gln Cys Gln Ala Gln Gly Thr Val Pro Ser Glu Pro Pro Ser Ala Cys 50 55 60 Ala Ser Asp Pro Cys Ala Pro Gly Thr Glu Cys Gln Ala Thr Glu Ser 65 70 75 80 Gly Gly Tyr Thr Cys Gly Pro Met Glu Pro Arg Gly Cys Ala Thr Gln 85 90 95 Pro Cys His His Gly Ala Leu Cys Val Pro Gln Gly Pro Asp Pro Thr 100 105 110 Gly Phe Arg Cys Tyr Cys Val Pro Gly Phe Gln Gly Pro Arg Cys Glu 115 120 125 Leu Asp Ile Asp Glu Cys Ala Ser Arg Pro Cys His His Gly Ala Thr 130 135 140 Cys Arg Asn Leu Ala Asp Arg Tyr Glu Cys His Cys Pro Leu Gly Tyr 145 150 155 160 Ala Gly Val Thr Cys Glu Met Glu Val Asp Glu Cys Ala Ser Ala Pro 165 170 175 Cys Leu His Gly Gly Ser Cys Leu Asp Gly Val Gly Ser Phe Arg Cys 180 185 190 Val Cys Ala Pro Gly Tyr Gly Gly Thr Arg Cys Gln Leu Asp Leu Asp 195 200 205 Glu Cys Gln Ser Gln Pro Cys Ala His Gly Gly Thr Cys His Asp 
Leu 210 215 220 Val Asn Gly Phe Arg Cys Asp Cys Ala Gly Thr Gly Tyr Glu Gly Thr 225 230 235 240 His Cys Glu Arg Glu Val Leu Glu Cys Ala Ser Ala Pro Cys Glu His 245 250 255 Asn Ala Ser Cys Leu Glu Gly Leu Gly Ser Phe Arg Cys Leu Cys Trp 260 265 270 Pro Gly Tyr Ser Gly Glu Leu Cys Glu Val Asp Glu Asp Glu Cys Ala 275 280 285 Ser Ser Pro Cys Gln His Gly Gly Arg Cys Leu Gln Arg Ser Asp Pro 290 295 300 Ala Leu Tyr Gly Gly Val Gln Ala Ala Phe Pro Gly Ala Phe Ser Phe 305 310 315 320 Arg His Ala Ala Gly Phe Leu Cys His Cys Pro Pro Gly Phe Glu Gly 325 330 335 Ala Asp Cys Gly Val Glu Val Asp Glu Cys Ala Ser Arg Pro Cys Leu 340 345 350 Asn Gly Gly His Cys Gln Asp Leu Pro Asn Gly Phe Gln Cys His Cys 355 360 365 Pro Asp Gly Tyr Ala Gly Pro Thr Cys Glu Glu Asp Val Asp Glu Cys 370 375 380 Leu Ser Asp Pro Cys Leu His Gly Gly Thr Cys Ser Asp Thr Val Ala 385 390 395 400 Gly Tyr Ile Cys Arg Cys Pro Glu Thr Trp Gly Gly Arg Asp Cys Ser 405 410 415 Val Gln Leu Thr Gly Cys Gln Gly His Thr Cys Pro Leu Ala Ala Thr 420 425 430 Cys Ile Pro Ile Phe Glu Ser Gly Val His Ser Tyr Val Cys His Cys 435 440 445 Pro Pro Gly Thr His Gly Pro Phe Cys Gly Gln Asn Thr Thr Phe Ser 450 455 460 Val Met Ala Gly Ser Pro Ile Gln Ala Ser Val Pro Ala Gly Gly Pro 465 470 475 480 Leu Gly Leu Ala Leu Arg Phe Arg Thr Thr Leu Pro Ala Gly Thr Leu 485 490 495 Ala Thr Arg Asn Asp Thr Lys Glu Ser Leu Glu Leu Ala Leu Val Ala 500 505 510 Ala Thr Leu Gln Ala Thr Leu Trp Ser Tyr Ser Thr Thr Val Leu Val 515 520 525 Leu Arg Leu Pro Asp Leu Ala Leu Asn Asp Gly His Trp His Gln Val 530 535 540 Glu Val Val Leu His Leu Ala Thr Leu Glu Leu Arg Leu Trp His Glu 545 550 555 560 Gly Cys Pro Ala Arg Leu Cys Val Ala Ser Gly Pro Val Ala Leu Ala 565 570 575 Ser Thr Ala Ser Ala Thr Pro Leu Pro Ala Gly Ile Ser Ser Ala Gln 580 585 590 Leu Gly Asp Ala Thr Phe Ala Gly Cys Leu Gln Asp Val Arg Val Asp 595 600 605 Gly His Leu Leu Leu Pro Glu Asp Leu Gly Glu Asn Val Leu Leu Gly 610 615 620 Cys Glu Arg Arg Glu Gln Cys Arg Pro Leu Pro Cys Val His Gly Gly 
625 630 635 640 Ser Cys Val Asp Leu Trp Thr His Phe Arg Cys Asp Cys Ala Arg Pro 645 650 655 His Arg Gly Pro Thr Cys Ala Asp Glu Ile Pro Ala Ala Thr Phe Gly 660 665 670 Leu Gly Gly Ala Pro Ser Ser Ala Ser Phe Leu Leu Gln Glu Leu Pro 675 680 685 Gly Pro Asn Leu Thr Val Ser Phe Leu Leu Arg Thr Arg Glu Ser Ala 690 695 700 Gly Leu Leu Leu Gln Phe Ala Asn Asp Ser Ala Ala Gly Leu Thr Val 705 710 715 720 Phe Leu Ser Glu Gly Arg Ile Arg Ala Glu Val Pro Gly Ser Pro Ala 725 730 735 Val Val Leu Pro Gly Arg Trp Asp Asp Gly Leu Arg His Leu Val Met 740 745 750 Leu Ser Phe Gly Pro Asp Gln Leu Gln Asp Leu Gly Gln His Val His 755 760 765 Val Gly Gly Arg Leu Leu Ala Ala Asp Ser Gln Pro Trp Gly Gly Pro 770 775 780 Phe Arg Gly Cys Leu Gln Asp Leu Arg Leu Asp Gly Cys His Leu Pro 785 790 795 800 Phe Phe Pro Leu Pro Leu Asp Asn Ser Ser Gln Pro Ser Glu Leu Gly 805 810 815 Gly Arg Gln Ser Trp Asn Leu Thr Ala Gly Cys Val Ser Glu Asp Met 820 825 830 Cys Ser Pro Asp Pro Cys Phe Asn Gly Gly Thr Cys Leu Val Thr Trp 835 840 845 Asn Asp Phe His Cys Thr Cys Pro Ala Asn Phe Thr Gly Pro Thr Cys 850 855 860 Ala Gln Gln Leu Trp Cys Pro Gly Gln Pro Cys Leu Pro Pro Ala Thr 865 870 875 880 Cys Glu Glu Val Pro Asp Gly Phe Val Cys Val Ala Glu Ala Thr Phe 885 890 895 Arg Glu Gly Pro Pro Ala Ala Phe Ser Gly His Asn Ala Ser Ser Gly 900 905 910 Arg Leu Leu Gly Gly Leu Ser Leu Ala Phe Arg Thr Arg Asp Ser Glu 915 920 925 Ala Trp Leu Leu Arg Ala Ala Ala Gly Ala Leu Glu Gly Val Trp Leu 930 935 940 Ala Val Arg Asn Gly Ser Leu Ala Gly Gly Val Arg Gly Gly His Gly 945 950 955 960 Leu Pro Gly Ala Val Leu Pro Ile Pro Gly Pro Arg Val Ala Asp Gly 965 970 975 Ala Trp His Arg Val Arg Leu Ala Met Glu Arg Pro Ala Ala Thr Thr 980 985 990 Ser Arg Trp Leu Leu Trp Leu Asp Gly Ala Ala Thr Pro Val Ala Leu 995 1000 1005 Arg Gly Leu Ala Ser Asp Leu Gly Phe Leu Gln Gly Pro Gly Ala 1010 1015 1020 Val Arg Ile Leu Leu Ala Glu Asn Phe Thr Gly Cys Leu Gly Arg 1025 1030 1035 Val Ala Leu Gly Gly Leu Pro Leu Pro Leu Ala Arg Pro Arg Pro 1040 
1045 1050 Gly Ala Ala Pro Gly Ala Arg Glu His Phe Ala Ser Trp Pro Gly 1055 1060 1065 Thr Pro Ala Pro Ile Leu Gly Cys Arg Gly Ala Pro Val Cys Ala 1070 1075 1080 Pro Ser Pro Cys Leu His Asp Gly Ala Cys Arg Asp Leu Phe Asp 1085 1090 1095 Ala Phe Ala Cys Ala Cys Gly Pro Gly Trp Glu Gly Pro Arg Cys 1100 1105 1110 Glu Ala His Val Asp Pro Cys His Ser Ala Pro Cys Ala Arg Gly 1115 1120 1125 Arg Cys His Thr His Pro Asp Gly Arg Phe Glu Cys Arg Cys Pro 1130 1135 1140 Pro Gly Phe Gly Gly Pro Arg Cys Arg Leu Pro Val Pro Ser Lys 1145 1150 1155 Glu Cys Ser Leu Asn Val Thr Cys Leu Asp Gly Ser Pro Cys Glu 1160 1165 1170 Gly Gly Ser Pro Ala Ala Asn Cys Ser Cys Leu Glu Gly Leu Ala 1175 1180 1185 Gly Gln Arg Cys Gln Val Pro Thr Leu Pro Cys Glu Ala Asn Pro 1190 1195 1200 Cys Leu Asn Gly Gly Thr Cys Arg Ala Ala Gly Gly Val Ser Glu 1205 1210 1215 Cys Ile Cys Asn Ala Arg Phe Ser Gly Gln Phe Cys Glu Val Ala 1220 1225 1230 Lys Gly Leu Pro Leu Pro Leu Pro Phe Pro Leu Leu Glu Val Ala 1235 1240 1245 Val Pro Ala Ala Cys Ala Cys Leu Leu Leu Leu Leu Leu Gly Leu 1250 1255 1260 Leu Ser Gly Ile Leu Ala Ala Arg Lys Arg Arg Gln Ser Glu Gly 1265 1270 1275 Thr Tyr Ser Pro Ser Gln Gln Glu Val Ala Gly Ala Arg Leu Glu 1280 1285 1290 Met Asp Ser Val Leu Lys Val Pro Pro Glu Glu Arg Leu Ile 1295 1300 1305 

What is claimed is:
 1. A method of identifying a candidate branching morphogenesis modulating agent, said method comprising the steps of: (a) providing an assay system comprising a CRB polypeptide or nucleic acid; (b) contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and (c) detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as a candidate branching morphogenesis modulating agent.
 2. The method of claim 1 wherein the assay system includes a screening assay comprising a CRB polypeptide, and the candidate test agent is a small molecule modulator.
 3. The method of claim 2 wherein the screening assay is a binding assay.
 4. The method of claim 1 wherein the assay system includes a binding assay comprising a CRB polypeptide and the candidate test agent is an antibody.
 5. The method of claim 1 wherein the assay system includes an expression assay comprising a CRB nucleic acid and the candidate test agent is a nucleic acid modulator.
 6. The method of claim 5 wherein the nucleic acid modulator is an antisense oligomer.
 7. The method of claim 6 wherein the nucleic acid modulator is a PMO.
 8. The method of claim 1 wherein the assay system comprises cultured cells or a non-human animal expressing CRB, and wherein the assay system includes an assay that detects an agent-biased change in branching morphogenesis.
 9. The method of claim 8 wherein the branching morphogenesis is angiogenesis.
 10. The method of claim 8 wherein the assay system comprises cultured cells.
 11. The method of claim 10 wherein the assay detects an event selected from the group consisting of cell proliferation, cell cycling, apoptosis, tubulogenesis, cell migration, cell sprouting and response to hypoxic conditions.
 12. The method of claim 10 wherein the assay detects tubulogenesis or cell migration or cell sprouting, and wherein the assay system comprises the step of testing the cellular response to stimulation with at least two different pro-angiogenic agents.
 13. The method of claim 10 wherein the assay detects tubulogenesis or cell migration, and wherein cells are stimulated with an inflammatory angiogenic agent.
 14. The method of claim 8 wherein the assay system comprises a non-human animal.
 15. The method of claim 14 wherein the assay system includes a matrix implant assay, a xenograft assay, a hollow fiber assay, or a transgenic tumor assay.
 16. The method of claim 15 wherein the assay system includes a transgenic tumor assay that includes a mouse comprising a RIP1-Tag2 transgene.
 17. The method of claim 1, comprising the additional steps of: (d) providing a second assay system comprising cultured cells or a non-human animal expressing CRB, (e) contacting the second assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity; and (f) detecting an agent-biased activity of the second assay system, wherein a difference between the agent-biased activity and the reference activity of the second assay system confirms the test agent or agent derived therefrom as a candidate branching morphogenesis modulating agent, and wherein the second assay system includes a second assay that detects an agent-biased change in an activity associated with branching morphogenesis.
 18. The method of claim 17 wherein the second assay detects an agent-biased change in an activity associated with angiogenesis.
 19. The method of claim 17 wherein the second assay system comprises cultured cells.
 20. The method of claim 19 wherein the second assay detects an event selected from the group consisting of cell proliferation, cell cycling, apoptosis, tubulogenesis, cell migration, cell sprouting and response to hypoxic conditions.
 21. The method of claim 20 wherein the second assay detects tubulogenesis or cell migration or cell sprouting, and wherein the second assay system comprises the step of testing the cellular response to stimulation with at least two different pro-angiogenic agents.
 22. The method of claim 20 wherein the second assay detects tubulogenesis or cell migration, and wherein cells are stimulated with an inflammatory angiogenic agent.
 23. The method of claim 17 wherein the second assay system comprises a non-human animal.
 24. The method of claim 23 wherein the second assay system includes a matrix implant assay, a xenograft assay, a hollow fiber assay, or a transgenic tumor assay.
 25. The method of claim 24 wherein the second assay system includes a transgenic tumor assay that includes a mouse comprising a RIP1-Tag2 transgene.
 26. A method of modulating branching morphogenesis in a mammalian cell comprising contacting the cell with an agent that specifically binds a CRB polypeptide or nucleic acid.
 27. The method of claim 26 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with branching morphogenesis.
 28. The method of claim 26 wherein the agent is a small molecule modulator, a nucleic acid modulator, or an antibody.
 29. The method of claim 26 wherein the branching morphogenesis is angiogenesis.
 30. The method of claim 29 wherein tumor cell proliferation is inhibited.
 31. A method for diagnosing a disease in a patient comprising: (a) obtaining a biological sample from the patient; (b) contacting the sample with a probe for CRB expression; (c) comparing results from step (b) with a control; and (d) determining whether step (c) indicates a likelihood of disease.
 32. The method of claim 31 wherein said disease is cancer.
 33. The method according to claim 32, wherein said cancer is colon, kidney, uterus, prostate, or skin cancer. 